XHTML™-Print
Draft 0.45
F. D. Wright
Director, Strategic and Technical Alliances
Lexmark International
don@lexmark.com
Melinda Grant
Hewlett-Packard
melinda_grant@hp.com
Peter Zehler
Xerox
peter.zehler@usa.xerox.com
November 1, 2000
Table of Contents
1 Overview
2 References
3 XHTML-Print Tags
3.1 Tags Required by XHTML-Print
3.2 Restrictions on XHTML by XHTML-Print
3.3 CSS Conformance
4 Changes to XHTML for XHTML-Print
4.1 Recommended attributes on the <img> and <object> tags
4.2 Page Breaks
4.3 Page Size and Orientation
4.3.1 Size Property
4.3.2 Margin Property
4.3.3 Examples
4.3.4 Rendering page boxes that do not fit a target sheet
4.3.5 Positioning the page box on the sheet
4.4 Running headers and Footers
4.5 Inline Image Data
4.5.1 Get
4.5.2 Inline
4.5.3 Push or Post
4.6 Side-by-Side Images
5 Conformance
5.1 XHTML Document Type Conformance
5.2 XHTML Document Conformance
5.3 Printer Conformance
5.3.1 Formatting/Rendering Rules
5.3.2 Printer Conformance
5.4 Photo-Imaging Extension
6 Work Items
Overview
This section is informative.
This document is intended to specify a simple XHTML-based data stream suitable for printing as well as display. It is largely based on the W3C's XHTML Basic. Its targeted usage is for printing in environments with lightweight and other simple clients that do not have the ability to install a printer-specific driver. Throughout this document this data stream is called XHTML-Print.
XHTML-Print is not to be used when strict layout consistency and repeatability are required. The design goal of XHTML-Print is to provide a relatively simple, broadly supportable print datastream where content preservation and reproduction are the goal, i.e. "Content is King". More traditional printer datastreams such as PostScript or PCL are more suitable when strict layout control is required.
This document creates a set of conformance criteria for XHTML-Print. It includes style sheet constructs drawn from CSS1/CSS2 and proposed for CSS3 to provide a strong basis for rich printing results without a detailed understanding of each individual printer's characteristics. It also defines conformance criteria for an optional extension set targeted at photo printing, the XHTML-Print Photo-Imaging Extension.
References
This section is informative.
The following definitions and references are used throughout the document.
1. XHTML™ 1.0: The Extensible HyperText Markup Language. A reformulation of HTML 4.0 as an XML application. See http://www.w3.org/TR/xhtml1
2. XHTML™ Basic: A subset of XHTML 1.0 that includes a reduced set of functions. See http://www.w3.org/TR/xhtml-basic
3. Modularization of XHTML™: A document which specifies an abstract modularization of XHTML and an implementation of the abstraction using XML Document Type Definitions (DTDs). This modularization provides a means for subsetting and extending XHTML, a feature needed for extending XHTML's reach onto emerging platforms. See http://www.w3.org/TR/xhtml-modularization
4. XML 1.0: The Extensible Markup Language (XML) is a subset of SGML that is to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML. See http://www.w3.org/TR/REC-xml
5. CSS 1.0: Cascading Style Sheets version 1.0 is a simple style sheet mechanism that allows authors and readers to attach style (e.g. fonts, colors and spacing) to HTML documents. See http://www.w3.org/TR/REC-CSS1
6. CSS 2.0: Cascading Style Sheets version 2.0 builds on CSS1 and, with very few exceptions, all valid CSS1 style sheets are valid CSS2 style sheets. CSS2 supports media-specific style sheets so that authors may tailor the presentation of their documents to visual browsers, aural devices, printers, braille devices, handheld devices, etc. CSS2 also adds content positioning, downloadable fonts, table layout, features for internationalization, automatic counters and numbering, and some properties related to user interface. See http://www.w3.org/TR/REC-CSS2/
7. "Namespaces in XML", T. Bray, D. Hollander, A. Layman, 14 January 1999. XML namespaces provide a simple method for qualifying names used in XML documents by associating them with namespaces identified by URI. Available at: http://www.w3.org/TR/REC-xml-names
8. Simple Object Access Protocol (SOAP) 1.1: SOAP is a lightweight XML-based protocol for exchange of information in a decentralized, distributed environment. It is a submission to the W3C and is available at http://www.w3.org/TR/SOAP
9. Introduction to CSS3: An introduction and roadmap of the work in progress on CSS level 3. At this time, it is available to W3C members only from: http://www.w3.org/Style/Group/css3-src/css3-roadmap/
10. RFC 2045 - Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies, N. Freed, N. Borenstein. Section 6.8, Base64 Content-Transfer-Encoding, is of particular interest. It is available at http://info.internet.isi.edu/in-notes/rfc/files/rfc2045.txt
11. XML Schema Part 2: Datatypes, W3C Working Draft 22 September 2000, P. Biron, A. Malhotra. Available at http://www.w3.org/TR/xmlschema-2/
12. JPEG File Interchange Format, version 1.02, September 1, 1992, C-Cube Microsystems. Available from ftp://ftp.uu.net/graphics/jpeg/jfif.ps.gz
XHTML-Print Tags
This section is normative.
Tags Required by XHTML-Print
The World Wide Web Consortium has defined a subset of XHTML 1.0 that is targeted to small format devices such as PDAs and cellular telephones. The definition of XHTML Basic is therefore a useful starting point for the definition of XHTML-Print. XHTML-Print is a proper superset of XHTML Basic in the set of tags required.
The Modularization of XHTML [Reference 3] is a decomposition of XHTML 1.0 and, by reference, HTML 4.0 into a collection of abstract modules that provide specific types of functionality. XHTML-Print is defined, in part, by inclusion of a set of these modules. As such, the non-CSS portion of XHTML-Print includes the following modules:
Basic Modules
Structure Module: body, head, html, title
Basic Text Module: abbr, acronym, address, blockquote, br, cite, code, dfn, div, em, h1, h2, h3, h4, h5, h6, kbd, p, pre, q, samp, span, strong, var
Text Extension: b, big, hr, i, small, sub, sup, tt
Hypertext Module: a
List Module: dl, dt, dd, ol, ul, li
Table Modules
Basic Table Module: caption, table, td, th, tr
Image Module: img
Basic Forms Module: form, input, label, select, option, textarea
MetaInformation Module: meta
Stylesheet Module: style
Link Module: link
Base Module: base
Object Module: object
Restrictions on XHTML by XHTML-Print
XHTML-Print also restricts some usage of common XHTML 1.0 tags in the same way that XHTML Basic does:
Nesting of Tables is NOT supported
Frames are NOT supported
CSS Conformance
See Printer Conformance, section 5.3.2, for CSS conformance requirements for XHTML-Print conforming implementations.
Changes to XHTML for XHTML-Print
This section is normative.
XHTML-Print inherits all the structure, encoding and other basic infrastructure specified by XHTML. The following functions are added, have usage restrictions or must be changed to meet the needs of printing.
Recommended attributes on the <img> and <object> tags
Because many printers create the page in a serial manner from top to bottom, it is important for the printer to know the size of images before retrieving the image data itself. This information is then used to create portions of the page layout.
Therefore, the sender is strongly encouraged to include the height and width attributes either within the <img> or the <object> tag, or within an associated style sheet rule. These attributes may be expressed as percentages within the <img> or the <object> tag, or may use the standard absolute or relative units within the CSS rule. Percentages are relative to the parent element and not the page width or printable area. Pixel units should be avoided, because the resultant size may vary markedly, depending on the native resolution of the printer or displaying device.
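The variation in pixel-based sizes can be illustrated with a short calculation. The following Python sketch is illustrative only and is not part of this specification. It assumes a device that maps one pixel unit to one device dot; the function name physical_width_inches and the three resolutions are hypothetical.

```python
def physical_width_inches(pixels: int, device_dpi: int) -> float:
    """Printed width in inches, assuming one pixel unit maps to one device dot."""
    if device_dpi <= 0:
        raise ValueError("device_dpi must be positive")
    return pixels / device_dpi


# The same 300-pixel-wide image on three hypothetical devices.
for dpi in (96, 300, 600):
    print(f"300 px at {dpi} dpi: {physical_width_inches(300, dpi):.3f} in")
```

Under this assumption the same image differs in printed width by a factor of more than six between the lowest and highest resolution, which is why absolute units or percentages are preferred.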
This document specifies only one mandatory image format, baseline JPEG as defined in [Reference 12]. Printers are not required to support the following within the JFIF files:
Embedded Thumbnails
Rotation
Progressive rendering
Page Breaks
Because of the differences between display in a browser and output on a paged media device such as a printer, the data source may want to control the locations of page breaks. Therefore, a single means is needed to cause a page break to occur.
Example (markup lost in extraction; remaining text: "Page Break using a CSS2 Method", "Testing Page Breaks", "Page 1 Text", "Page 2 Text").
Conforming implementations shall support the CSS2 page-break properties. These properties may be used on any element to which they apply under CSS2, i.e. block-level elements. Conforming XHTML-Print printers shall support both in-line and referenced style sheets.
Page Size and Orientation
(The following is a summary and partial extraction from the CSS2 document)
Page size and orientation are controlled using the @page rules from CSS2. Specifically, the size property is applied to @page to set both size and orientation. The margin properties (margin-top, margin-bottom, margin-left, margin-right, and margin) as defined by CSS2 also apply within the page context. Page size and orientation provided in the XHTML-Print datastream will override similar attributes contained within the commands and/or attributes provided when creating the print job itself.
Size Property
Usage:
@page {
size: auto; /* auto is the initial value and uses the default paper */
}
'size'
Value: <length>{1,2} | auto | portrait | landscape | inherit
Initial: auto
Applies to: the page context
Inherited: N/A
Percentages: N/A
Media: paged
This property specifies the size and orientation of a page box. The size of a page box may either be "absolute" (fixed size) or "relative" (scalable, i.e., fitting available sheet sizes). Relative page boxes allow printers to scale a document and make optimal use of the target size. Three values for the 'size' property create a relative page box:
auto: The page box will be set to the size and orientation of the target sheet.
landscape: Overrides the target's orientation. The page box is the same size as the target, and the longer sides are horizontal.
portrait: Overrides the target's orientation. The page box is the same size as the target, and the shorter sides are horizontal.
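The three relative values can be read as a simple mapping from the target sheet to the page box. The following Python sketch is illustrative only and is not part of this specification; the function name relative_page_box is hypothetical, and dimensions are plain numbers in any one unit.

```python
from typing import Tuple


def relative_page_box(size_value: str, sheet_width: float, sheet_height: float) -> Tuple[float, float]:
    """Return (width, height) of a relative page box for a given target sheet."""
    short_side, long_side = sorted((sheet_width, sheet_height))
    if size_value == "auto":
        # Same size and orientation as the target sheet.
        return (sheet_width, sheet_height)
    if size_value == "landscape":
        # Same size as the target; longer sides horizontal.
        return (long_side, short_side)
    if size_value == "portrait":
        # Same size as the target; shorter sides horizontal.
        return (short_side, long_side)
    raise ValueError("not a relative 'size' value: " + size_value)


print(relative_page_box("landscape", 21.0, 29.7))
```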
Margin Property
The margin property is supported as specified in CSS2, clause 8.3.
Examples
In the following example, the outer edges of the page box will align with the target. The percentage value on the 'margin' property is relative to the target size so if the target sheet dimensions are 21.0cm x 29.7cm (i.e., A4), the margins are 2.10cm and 2.97cm.
@page {
size: auto; /* auto is the initial value */
margin: 10%;
}
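The margin arithmetic in this example can be checked with a few lines of Python. This sketch is illustrative only and is not part of this specification; the function names percent_margins and page_area are hypothetical. It follows the text above in taking left and right margins as a percentage of the sheet width and top and bottom margins as a percentage of the sheet height.

```python
from typing import Tuple


def percent_margins(sheet_width: float, sheet_height: float, percent: float) -> Tuple[float, float]:
    """Return (side margin, top/bottom margin) for a percentage 'margin' value."""
    side = sheet_width * percent / 100.0
    top_bottom = sheet_height * percent / 100.0
    return side, top_bottom


def page_area(sheet_width: float, sheet_height: float, percent: float) -> Tuple[float, float]:
    """Return (width, height) of the area left inside the margins."""
    side, top_bottom = percent_margins(sheet_width, sheet_height, percent)
    return sheet_width - 2 * side, sheet_height - 2 * top_bottom


side, top_bottom = percent_margins(21.0, 29.7, 10)
print(f"side margins {side:.2f}cm, top/bottom margins {top_bottom:.2f}cm")
```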
Length values for the 'size' property create an absolute page box. If only one length value is specified, it sets both the width and height of the page box (i.e., the box is a square). Since the page box is the initial containing block, percentage values are not allowed for the 'size' property.
For example:
@page {
size: 8.5in 11in; /* width height */
}
The above example sets the width of the page box to be 8.5in and the height to be 11in. The page box in this example requires a target sheet size of 8.5"x11" or larger. Printers may allow users to control the transfer of the page box to the sheet (e.g., rotating an absolute page box that's being printed).
Rendering page boxes that do not fit a target sheet
If a page box does not fit the target sheet dimensions, the printer may choose (in order of preference) to:
Rotate the page box 90° if this will make the page box fit.
Scale the page to fit the target.
Reformat the page (including spilling onto another sheet)
Clip (least preferred)
The printer may consult the user before performing these operations. Lacking access to the user, it may simply make a decision on its own.
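The order of preference above can be sketched as a small decision function. This Python sketch is illustrative only and is not part of this specification. The function name choose_fit_action and the can_scale and can_reformat flags are hypothetical; they stand for printer capabilities that this document does not define.

```python
def choose_fit_action(page_w: float, page_h: float,
                      sheet_w: float, sheet_h: float,
                      can_scale: bool = True,
                      can_reformat: bool = True) -> str:
    """Pick an action for a page box, following the stated order of preference."""
    if page_w <= sheet_w and page_h <= sheet_h:
        return "print"      # the page box already fits
    if page_h <= sheet_w and page_w <= sheet_h:
        return "rotate"     # a 90 degree rotation makes it fit
    if can_scale:
        return "scale"      # scale the page to fit the target
    if can_reformat:
        return "reformat"   # may spill onto another sheet
    return "clip"           # least preferred


# A landscape letter page box on a portrait letter sheet.
print(choose_fit_action(11, 8.5, 8.5, 11))
```

A printer with access to the user could present the chosen action for confirmation before applying it.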
Positioning the page box on the sheet
When the page box is smaller than the target size, the user agent is free to place the page box anywhere on the sheet. However, it is recommended that the page box be centered on the sheet since this will align double-sided pages and avoid accidental loss of information that is printed near the edge of the sheet.
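The recommended centering amounts to splitting the unused space equally on each side. The following Python sketch is illustrative only and is not part of this specification; the function name centered_offset is hypothetical.

```python
from typing import Tuple


def centered_offset(page_w: float, page_h: float,
                    sheet_w: float, sheet_h: float) -> Tuple[float, float]:
    """Return the (left, top) offset that centers the page box on the sheet."""
    if page_w > sheet_w or page_h > sheet_h:
        raise ValueError("page box does not fit the sheet")
    return ((sheet_w - page_w) / 2, (sheet_h - page_h) / 2)


# A 6in x 9in page box on an 8.5in x 11in sheet.
print(centered_offset(6, 9, 8.5, 11))
```

Because the left and right offsets are equal, the front and back of a double-sided sheet line up without further adjustment.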
Running headers and Footers
A means is needed to create a running-header and a running-footer on the printed page. Current work in progress by the W3C on paged media defines a very robust method for adding margin boxes to the top, bottom, left and right of the page. A reduced set from the CSS3 proposal is employed, using top and bottom margin boxes via the @page rules method. N.B. CSS3 progress should be tracked and mirrored for this feature.
Utilizing the terminology of CSS2 and CSS3, a margin box is defined in conjunction with the page box and page area (as shown in Figure 1) to create an area into which running-header and running-footer text can be inserted.
CSS3 proposes the ability to left-align, right-align and center the text horizontally as well as methods to top-align, bottom-align and center the text vertically within the margin boxes. For XHTML-Print conforming implementations vertical controls are not required to be supported. Instead, for XHTML-Print conforming implementations, the running-header text may be top aligned in the margin box and the running-footer text may be bottom aligned in the margin box.
CSS3 proposes methods for the printing device to automatically include the following in the running-header and running-footer:
page number
total pages in the document
date
time
file name
XHTML-Print conforming implementations are only required to support inserting a page number using the counter(pages) object. If required, the sending appliance must provide the other information within the text string to be printed in the margin box.
Figure 1
The following are sample XHTML/CSS fragments used to create running-headers and running-footers.
The above example creates a running header that is left aligned at 150% of normal font size and bold in Helvetica, Arial or the default sans-serif font, whichever is available.
The above example creates a running footer such as "Page 14" centered on the page in a font 80% of normal size in Times, Palatino or the default serif font, whichever is available.
Inline Image Data
In web-based applications of XHTML, image data is contained in a separate file on the web server that the user agent retrieves. Some low-cost, resource-constrained clients may want to include images in their print output but cannot afford to include a server. In this case, the image data must be contained in the XHTML-Print file sent to the printer.
This inline image data is treated as xsd:binary data type from the XML schema [Reference 11]. That is, the image data is treated as a sequence of binary octets that have been encoded to allow inclusion within the XHTML-Print document. Although both hex and base64 [Reference 10] encoding are allowed for xsd:binary, XHTML-Print will restrict the encoding to base64.
In XHTML-Print, the base64-encoded bytes travel over a transport capable of binary data transfer rather than in an Internet Mail message body. Consequently, no restrictions are placed on line lengths. There is no need for special boundary delimiters, since the < and > characters are not valid characters in the base64 encoding. Inline image transfer is not a recommended practice, especially for large images, because base64 encoding increases the data size by about 33%.
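The properties described above (no line breaks, no < or > characters, and roughly one third more data) can be observed with the Python standard library. This sketch is illustrative only and is not part of this specification; the function name encode_inline_image is hypothetical, and the byte string is a stand-in for real JPEG data.

```python
import base64


def encode_inline_image(image_bytes: bytes) -> str:
    """Base64-encode image octets as one unbroken ASCII string."""
    return base64.b64encode(image_bytes).decode("ascii")


# Stand-in for JPEG data: 3072 arbitrary octets covering every byte value.
raw = bytes(range(256)) * 12
encoded = encode_inline_image(raw)

overhead = len(encoded) / len(raw) - 1
print(f"{len(raw)} octets become {len(encoded)} characters ({overhead:.1%} larger)")
```

Every three input octets become four output characters, which is the source of the size increase.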
The image data included in an XHTML document may take one of three forms.
Get
The first form uses the data attribute of the object element or the src attribute of the img element to reference the URL of the image file that is stored external to the printer. This requires a server on the client capable of delivering the file. (Although the img element has not been deprecated in XHTML 1.0, the clear preference and direction for the future is for the more flexible object form.)
Or alternately:
Inline
The second method to inline image data in XHTML-Print is via a forward reference. The declare attribute of the object element is used to define the object, but delay its processing. The id attribute is used to associate the forward reference with the image content, sent at the end of the XHTML-Print document.
. . . .
This method may be useful for very simple clients which cannot afford a server for image download but it is not recommended for general use. If the content of the XHTML-Print document spans multiple pages, the printer may not be able to buffer the pages described between the image declaration and the image data description.
(Note: This method may be replaced or augmented by a methodology that uses multi-part MIME related packaging of both the XHTML-Print and the images. The sender may break the XHTML-Print content into chunks in order to place the image data in-line. The printer is to reassemble these XHTML-Print chunks by concatenating them together. Further details may be provided in a subsequent version of this document.)
Side-by-Side Images
Low-cost printers today often have very little memory in which page data can be stored before being printed. As such, they must build and print the page in swaths on the fly from the top of the page to the bottom. To enable the use of XHTML-Print in these low-cost printers, some restrictions on the order of images contained in the XHTML-Print data stream must be added.
If two or more images will be even partially side-by-side on the printed page they should be included by reference (<img> or <object>) rather than included in-line. This allows the printer to get chunks of each image, as it needs them, as it prints down the page.
If the image data is included in-line, an XHTML-Print conforming printer may choose to not print one or more of the side-by-side images. Clients providing the images in-line should order them from left-to-right, top-to-bottom unless the print direction is known to be otherwise.
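A client that knows where its images will land on the page could apply these two rules as follows. This Python sketch is illustrative only and is not part of this specification. The Placed record, its field names, and both function names are hypothetical. It interprets "even partially side-by-side" as overlapping vertical extents, and "left-to-right, top-to-bottom" as ordering by top edge first and left edge second.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Placed:
    """An image with its position and size on the page, in any one unit."""
    name: str
    left: float
    top: float
    width: float
    height: float


def side_by_side(a: Placed, b: Placed) -> bool:
    """True if the vertical extents of the two images overlap at all."""
    return a.top < b.top + b.height and b.top < a.top + a.height


def inline_order(images: List[Placed]) -> List[Placed]:
    """Order in-line images by top edge, then by left edge."""
    return sorted(images, key=lambda im: (im.top, im.left))


a = Placed("a", left=0, top=0, width=3, height=2)
b = Placed("b", left=4, top=1, width=3, height=2)
c = Placed("c", left=0, top=5, width=3, height=2)
print(side_by_side(a, b), [im.name for im in inline_order([c, b, a])])
```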
Conformance
This section is normative.
XHTML Document Type Conformance
It is possible to modify existing document types and define wholly new document types using both modules defined in this specification and other modules. Such a document type conforms to this specification when it meets the following criteria:
The document type must be defined using one of the implementation methods defined by the W3C (currently this is limited to XML DTDs, but XML Schema will be available soon).
The document type must have a unique identifier as defined in XHTML-Mod [Reference 3] Naming Rules.
The document type must include, at a minimum, the Structure, Hypertext, Basic Text, and List modules defined in the XHTML-Mod specification.
For each of the W3C-defined modules that are included, all of the elements, attributes, and any required minimal content models must be included (and optionally extended) in the document type's content model.
The document type may define additional elements and attributes. However, these must be in their own XML Namespace [Reference 7].
XHTML Document Conformance
Documents that rely upon XHTML-family document types are considered XHTML conforming if they validate against their referenced document type.
Client Conformance
Clients shall produce a well-formed XHTML document as defined in [Reference 1].
Beyond number 1 above, clients are required to use no more of the XHTML-Print tags or Style Sheet attributes than necessary to get the desired output.
Printer Conformance
Formatting/Rendering Rules
In order to be consistent with the XML 1.0 Recommendation [Reference 4], the printer must parse and evaluate an XHTML document to determine if the document is well formed. If the printer claims to be a validating printer, it must also validate documents against their referenced DTDs according to XML. Validation is not required to claim conformance to this standard.
When the printer claims to support facilities defined within this specification or required by this specification through normative reference, it must do so in ways consistent with the facilities' definition.
When a printer processes an XHTML document as generic XML, it shall only recognize attributes of type ID (e.g. the id attribute on most XHTML elements) as fragment identifiers.
Images:
If a printer encounters an image in a format it does not support, it will reserve the space specified by the height and width attributes, optionally drawing a box of the specified size around this space.
If the image format is not supported and the height and width attributes were omitted, the image is omitted and no space is reserved.
If the image format is supported and the height and width attributes were omitted, the printer may choose to omit the image from the page.
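The three image rules above, together with the ordinary case of a supported image with known dimensions, can be summarized as a small decision function. This Python sketch is illustrative only and is not part of this specification; the function name image_action and the returned labels are hypothetical.

```python
from typing import Optional


def image_action(format_supported: bool,
                 width: Optional[str],
                 height: Optional[str]) -> str:
    """Return what a printer does with an image under the rules above."""
    has_size = width is not None and height is not None
    if not format_supported:
        # Reserve the stated space (a box may be drawn), or drop the image.
        return "reserve_space" if has_size else "omit"
    if not has_size:
        # Supported format without dimensions: the printer may omit it.
        return "render_or_omit"
    return "render"


print(image_action(False, "4in", "3in"))
```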
If a printer encounters an element it does not recognize, it should render the element's content as if the element and its end tag were not present at all. Printers may choose not to render content within elements defined by XHTML, HTML or deprecated from HTML which is obviously not intended to be rendered, e.g.