PDF and Its Application in Electronic Publishing
Yang Daoliang Chang Ming Ren Xiaoxia
1 PDF Overview
PDF (Portable Document Format) is a structured document format. Adobe, a well-known US publishing and image-processing software company, first published it in 1993 (version 1.0) and launched the supporting software product line, Adobe Acrobat 1.0, in the same year. Adobe revised the format in 1994, releasing version 1.1 together with Adobe Acrobat 2.0 and 2.1. Version 1.2, the latest, was released on November 27, 1996, and Adobe Acrobat was upgraded to version 3.0 to match. At the end of 1997, the International Organization for Standardization began considering PDF for adoption as an international standard.
1.1 Comparison between PDF and PS
PostScript (PS), also owned by Adobe, is the de facto standard of the printing industry. It can describe sophisticated layouts and still dominates the printing field. PDF evolved from PS, and the two have almost the same page-description capabilities and similar description methods. PDF uses the same imaging model as PS to represent text and graphics. As in the PS language, PDF's page description instructions draw a page by painting selected areas. A painted area can be defined by character outlines, by lines and curves, or by a bitmap; the paint can be any color; and any graphic on the page can be clipped to another shape. The page starts out empty, and successive instructions draw graphics onto it. New graphics are opaque and cover whatever lies beneath them.
However, PDF still differs from PS in several important ways:
PDF files can contain interactive objects such as hyperlinks and interactive forms; PS files cannot.
PDF is a file format, whereas PS is a programming language. A PDF file can therefore be processed more efficiently than a PS file.
PDF's strict structure definition lets an application access objects randomly, whereas a PS file can only be read sequentially. For example, to reach page 100 of a PS file, the first 99 pages must be interpreted first, while every page of a PDF file can be reached equally quickly.
A PDF file contains font description information such as the font's metrics, so when a font is missing, the viewer can emulate it (rather than simply substituting another font) and keep the document's appearance consistent.
1.2 Comparison between PDF and HTML
HTML is an application of SGML (Standard Generalized Markup Language) and is the main form of information publishing on the Internet. It can describe the basic style of a web page that combines illustrations and text, and it supports interaction and hyperlinks. Combined with Java or scripts it gains some processing capability, and it can interact with the server through CGI. Like HTML, PDF supports interactive forms and hyperlinks and is suitable for publishing information online. Unlike HTML, PDF can also describe sophisticated layouts. PDF thus unifies paper printing and electronic publishing: once the typeset content is saved as a PDF file, it can be published on the network at the same time as it is sent to print (interactive content has to be added separately). There is no longer any need for the current practice of two teams producing two sets of typesetting, one generating PS for paper printing and the other generating HTML documents for electronic publishing, which wastes resources and manpower and lowers productivity.
Besides lacking layout description capabilities, HTML often displays information inconsistently: the same web page may look different on different platforms, in different browsers, and in browser windows of different sizes. PDF solves this problem well.
1.3 Features of PDF
The features of PDF are summarized as follows:
Transmissibility. PDF files support both 7-bit ASCII and binary encoding, so they can be transmitted correctly in a variety of network environments.
Platform independence. PDF files are independent of both hardware and software platforms. The format and content that a user sees in different environments (such as operating systems in different languages and different hardware platforms) are exactly what the author created. This makes PDF well suited to information exchange and eliminates the problem of garbled characters.
Font independence. A PDF file can carry its own fonts or font description information, so it still displays correctly when the user's system lacks a required font.
Support for multiple compression and encoding methods, which make files more compact. The methods are: ASCIIHex, ASCII85, LZW, RunLength, CCITT Group 3, CCITT Group 4, JPEG, and Flate.
Support for interactive operations. A PDF file may contain interactive forms and hyperlinks, and it supports sound and animation.
Support for random access to pages.
Support for incremental updates. Modifications can be appended to the file one after another, which makes minor changes easy and efficient.
Security control. PDF supports several levels of security, such as read-only (no printing or text selection); readable and printable but not modifiable; and readable, printable, and modifiable. This kind of control is very important for protecting the copyright of electronic publications.
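The interplay between the compression methods and 7-bit transmissibility listed above can be sketched in Python. This is only an illustration using the standard library, not the way any particular PDF producer works; the function names are hypothetical. Flate is assumed to correspond to zlib's deflate, and the ASCII85 output is terminated with the ~> end-of-data marker.

```python
import base64
import binascii
import zlib


def encode_flate_ascii85(data: bytes) -> bytes:
    """Compress with Flate, then wrap the binary result in 7-bit-safe ASCII85."""
    compressed = zlib.compress(data)
    # "~>" marks the end of the ASCII85 data.
    return base64.a85encode(compressed) + b"~>"


def decode_flate_ascii85(encoded: bytes) -> bytes:
    """Undo the two filters in reverse order: ASCII85 first, then Flate."""
    if encoded.endswith(b"~>"):
        encoded = encoded[:-2]
    return zlib.decompress(base64.a85decode(encoded))


def encode_ascii_hex(data: bytes) -> bytes:
    """ASCIIHex: two hexadecimal digits per byte, terminated by '>'."""
    return binascii.hexlify(data).upper() + b">"


def decode_ascii_hex(encoded: bytes) -> bytes:
    """Decode ASCIIHex data, ignoring the trailing '>' marker."""
    return binascii.unhexlify(encoded.rstrip(b">"))
```

Chaining the two filters gives a stream that is both compact and safe to send over a 7-bit channel, at the cost of the roughly 25% expansion that ASCII85 adds on top of the compressed data.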
2 PDF Structure
2.1 PDF File Structure
The PDF file structure (i.e., the physical structure) consists of four parts: the file header, the file body, the cross-reference table, and the end of the file (the trailer). See Figure 1.
The header gives the version number of the PDF specification that the file complies with. It appears on the first line of the PDF file. For example, %PDF-1.2 indicates that the file conforms to the PDF 1.2 specification.
The file body consists of a series of indirect objects. These indirect objects make up the actual content of the PDF file, such as fonts, pages, and images.
The cross-reference table is an index of indirect-object addresses, set up to allow random access to indirect objects.
The trailer gives the address of the cross-reference table, identifies the root object (catalog) of the file body, and also stores security information such as encryption.
From the information in the trailer, a PDF application can find the cross-reference table and the root object of the entire PDF file, and from there it can navigate the whole file.
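The four-part layout and the lookup path it enables can be sketched in Python. The sketch below is illustrative only: the function names are hypothetical, and the reader assumes a single cross-reference section that starts at object 0 with fixed 20-byte entries, which real files do not always satisfy.

```python
import re


def build_minimal_pdf() -> bytes:
    """Assemble header, body, cross-reference table, and trailer in order."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>",
    ]
    out = bytearray(b"%PDF-1.2\n")            # file header
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))              # byte address of this indirect object
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_offset = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"           # entry 0 is the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset   # every entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)


def find_root(pdf: bytes) -> int:
    """Return the object number of the root object named at the end of the file."""
    match = re.search(rb"trailer.*?/Root\s+(\d+)\s+\d+\s+R", pdf, re.S)
    return int(match.group(1))


def read_object(pdf: bytes, number: int) -> bytes:
    """Jump straight to one indirect object via the cross-reference table."""
    # The end of the file says where the cross-reference table starts.
    xref_offset = int(re.search(rb"startxref\s+(\d+)\s+%%EOF", pdf).group(1))
    # Skip the "xref" line and the "first count" line that follows it.
    table_start = pdf.index(b"\n", pdf.index(b"\n", xref_offset) + 1) + 1
    entry = pdf[table_start + 20 * number : table_start + 20 * (number + 1)]
    offset = int(entry[:10])                  # first 10 digits are the byte offset
    end = pdf.index(b"endobj", offset)
    return pdf[offset : end + len(b"endobj")]
```

Reading object 3 touches only the last few lines of the file and one 20-byte table entry before seeking to the object itself, which is why access cost does not grow with the object's position in the file.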
2.2 PDF Document Structure
The PDF document structure is the logical organization of the PDF file's content. It reflects the hierarchical relationships among the indirect objects in the file body. The document structure is a tree, as shown in Figure 2. The root node of the tree is the root object of the PDF file. Under the root node there are four subtrees: the page tree, the outline (bookmark) tree, the article threads, and the named destinations (name tree).
In the page tree, all page objects are leaf nodes. Child nodes inherit attribute values from their parent nodes as the defaults for the corresponding attributes.
In the outline tree, bookmarks are organized hierarchically. A bookmark associates a bookmark name with a location on a specific page, so the user can reach a part of the document by bookmark name. Because bookmarks can be hierarchical and can be used to organize a document, the outline tree is sometimes called the directory tree.
The thread tree organizes article threads, and the article beads under each thread, in a tree structure. An article bead is a predefined area on a page, generally a piece of text or an image of interest to the reader. Its purpose is to let the viewer fill the visible area with just that area, free of interference from other parts of the page. An article thread links predefined beads together. When the reader follows a thread, the viewer displays only the beads in that thread, in order, so the reader can read just the content of interest without having to read the document sequentially.
The name tree establishes a correspondence between strings (names) and page areas. Leaf nodes store the strings and their corresponding page areas, while non-leaf nodes serve only as an index that lets the application reach the leaf nodes quickly. The role of the name tree is to let other objects in the PDF file refer to a page area by a string name.
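The attribute inheritance in the page tree can be sketched in Python with plain dictionaries standing in for PDF objects. The helper names and the sample tree are hypothetical, and MediaBox and Rotate are used here only as example attributes.

```python
from typing import Any, Optional


def inherited(node: Optional[dict], key: str) -> Optional[Any]:
    """Look up an attribute on a page-tree node, falling back to its ancestors."""
    while node is not None:
        if key in node:
            return node[key]
        node = node.get("Parent")   # climb one level toward the root
    return None


def leaf_pages(node: dict) -> list:
    """Collect the page objects (the leaves of the tree) in document order."""
    if node.get("Type") == "Page":
        return [node]
    pages = []
    for kid in node.get("Kids", []):
        pages.extend(leaf_pages(kid))
    return pages


# A small sample tree: the root holds a default MediaBox for every page.
root = {"Type": "Pages", "MediaBox": [0, 0, 612, 792], "Kids": []}
chapter = {"Type": "Pages", "Parent": root, "Rotate": 90, "Kids": []}
page1 = {"Type": "Page", "Parent": root}
page2 = {"Type": "Page", "Parent": chapter, "MediaBox": [0, 0, 595, 842]}
root["Kids"] = [page1, chapter]
chapter["Kids"] = [page2]
```

Here page1 has no MediaBox of its own and picks up the root's value, while page2 overrides the MediaBox but still inherits Rotate from the intermediate node.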
2.3 Resources in PDF
Page content in a PDF (such as text, graphics, and images) is stored in the stream object referenced by the Contents key of the page object (hereafter, the content stream). Many basic objects used in the content stream, such as numbers and strings, are represented as direct objects. Other objects, such as fonts, are represented by dictionaries or streams. These cannot be written as direct objects, and indirect objects may not appear in the content stream either (otherwise they could not be distinguished from the content data itself). Such objects are therefore given names and are referred to in the content stream by those names. Objects referred to by name are called named resources.
The page object has a Resources key, which lists all the resources used in the content stream and establishes a mapping between each resource name and the resource object itself.
The named resources in PDF are: ProcSet, font, color space, external objects (XObject, including image, form, and PostScript segment), extended graphics state, pattern, and user extension list (property list).
Unnamed resources include: encoding, font descriptor, halftone, function, and CMap. Because unnamed resources are referenced implicitly, they need no names.
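How a name in the content stream leads to a resource object can be sketched in Python. The page dictionary, the resource names F1 and Im1, and the helper function are hypothetical. The tokenizer simply splits on whitespace, which assumes that no string in the stream contains spaces; a real parser has to be more careful.

```python
page = {
    "Resources": {
        "Font": {"F1": {"Type": "Font", "BaseFont": "Helvetica"}},
        "XObject": {"Im1": {"Type": "XObject", "Subtype": "Image"}},
    },
    "Contents": b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET q /Im1 Do Q",
}

# Operators that take a resource name, and the resource category they look in.
NAME_OPERATORS = {"Tf": "Font", "Do": "XObject"}


def resolve_named_resources(page: dict) -> list:
    """Return (operator, name, resource object) for each named reference."""
    tokens = page["Contents"].decode("ascii").split()
    resolved = []
    last_name = None
    for token in tokens:
        if token.startswith("/"):
            last_name = token[1:]            # remember the most recent name operand
        elif token in NAME_OPERATORS and last_name is not None:
            category = NAME_OPERATORS[token]
            resource = page["Resources"][category][last_name]
            resolved.append((token, last_name, resource))
            last_name = None
    return resolved
```

The content stream itself never contains the font or image data, only the short names, so the same stream can be reused with a different resource mapping.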
2.4 PDF Page Description Instructions
PDF has a total of 60 page description instructions, which describe a series of graphic objects on the page. These graphic objects fall into four categories: path objects, text objects, image objects, and external objects. See Figure 3. They are the basic elements from which all pages are built.
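As a rough illustration of what these instructions look like, the Python sketch below emits content-stream text for three of the four categories using a handful of the instructions (m, l, S, re, f, BT, Tf, Td, Tj, ET, Do). The helper names are hypothetical, and inline images are left out because they carry binary sample data.

```python
def line(x0, y0, x1, y1) -> str:
    """Path object: move to a point, add a line segment, then stroke the path."""
    return f"{x0} {y0} m {x1} {y1} l S"


def filled_rect(x, y, width, height) -> str:
    """Path object: append a rectangle, then fill it."""
    return f"{x} {y} {width} {height} re f"


def text(font_name, size, x, y, string) -> str:
    """Text object: select a named font, position the text, and show a string."""
    escaped = (
        string.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    )
    return f"BT /{font_name} {size} Tf {x} {y} Td ({escaped}) Tj ET"


def external_object(name) -> str:
    """External object: paint a named XObject such as an image or form."""
    return f"/{name} Do"


content_stream = "\n".join(
    [
        filled_rect(72, 700, 200, 40),
        line(72, 690, 272, 690),
        text("F1", 12, 80, 715, "Hello (PDF)"),
        external_object("Im1"),
    ]
)
```

Operands come before the operator in every instruction, a postfix order that PDF shares with PS.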
3 Generation of PDF files
There are currently two ways to generate PDF:
· Generating PDF by printing uses a virtual PDF printer to convert the application's text and graphics instructions (such as GDI commands under Windows or QuickDraw commands on the Mac) to PDF and save them in a PDF file. See Figure 4. Once Adobe Acrobat PDFWriter is installed, any application that can print can in theory print its output to a PDF file. However, generating Chinese PDF files this way still has many problems.
· Converting from PS to PDF is the other method. The application first prints its output to a PS file, and Adobe Acrobat Distiller then converts the PS file to a PDF file. See Figure 5.
Both methods have advantages and disadvantages. Generating PDF by printing integrates closely with the application, so that to the user the PDF appears to be generated directly from the application. Its disadvantage is that the limitations of the GDI and QuickDraw instruction sets make it difficult to generate high-precision PDF. Converting from PS to PDF involves one more step, but because PS has high-precision description capability, the resulting PDF can achieve print-level quality and accuracy.