XHTML™-Print
Draft 0.15
F. D. Wright
Director, Strategic and Technical Alliances
Lexmark International
don@lexmark.com
Melinda Grant
Hewlett-Packard
melinda_grant@hp.com
October 2, 2000
Table of Contents
1 Overview
2 References
3 XHTML-Print Tags
3.1 Tags Required by XHTML-Print
3.2 Restrictions on XHTML by XHTML-Print
3.3 CSS Conformance
4 Changes to XHTML for XHTML-Print
4.1 Recommended attributes on the <img> tag
4.2 Page Breaks
4.3 Running headers and Footers
4.4 Inline Image Data
4.5 Side-by-Side Images
5 Conformance
5.1 XHTML Document Type Conformance
5.2 XHTML Document Conformance
5.3 Printer Conformance
5.3.1 Formatting/Rendering Rules
5.3.2 Printer Conformance Levels
5.3.2.1 L0 Printer
5.3.2.2 L1 Printer
5.3.2.3 L2 Printer
6 Work Items
Overview
This section is informative.
This document specifies a simple XHTML-based datastream for printing. It is based largely on the W3C's XHTML Basic. It targets printing in environments with lightweight and other simple clients that cannot install a printer-specific driver. Throughout this document this datastream is called XHTML-Print.
The goal of this effort is to create a single set of conformance criteria for XHTML-Print; however, that minimum level has not yet been defined. This document therefore describes three candidate conformance levels, any one of which could become the eventual single level. Alternatively, a level below, between or above these three could be selected. The three candidate levels are:
Level 0 is a minimum set of tags and controls to provide basic printing capability. No support for CSS1 or CSS2 is provided. This is known throughout the document as XHTML-Print L0 or just L0.
Level 1 includes all of L0 and adds more functionality and provides more control over appearance of the printed page by including major components of CSS1 and parts of CSS2. This is known throughout the document as XHTML-Print L1 or just L1.
Level 2 includes all of L0 and L1 and adds even more functionality by including most of CSS1 and more of CSS2. This is known throughout the document as XHTML-Print L2 or just L2.
A printer which implements the full XHTML 1.0 set of tags and controls (i.e. those specified by the transitional or frameset DTDs) would be suitable for reference printing. This Advanced Printer is not defined specifically in this document.
References
This section is informative.
The following definitions and references are used throughout the document.
1. XHTML™ 1.0: The Extensible HyperText Markup Language. A reformulation of HTML 4.0 as an XML application. See http://www.w3.org/TR/xhtml1
2. XHTML™ Basic: A subset of XHTML 1.0 that includes a reduced set of functions. See http://www.w3.org/TR/xhtml-basic
3. Modularization of XHTML™: A document which specifies an abstract modularization of XHTML and an implementation of the abstraction using XML Document Type Definitions (DTDs). This modularization provides a means for subsetting and extending XHTML, a feature needed for extending XHTML's reach onto emerging platforms. See http://www.w3.org/TR/xhtml-modularization
4. XML 1.0: The Extensible Markup Language (XML) is a subset of SGML that is to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML. See http://www.w3.org/TR/REC-xml
5. CSS 1.0: Cascading Style Sheets version 1.0 is a simple style sheet mechanism that allows authors and readers to attach style (e.g. fonts, colors and spacing) to HTML documents. See http://www.w3.org/TR/REC-CSS1
6. CSS 2.0: Cascading Style Sheets version 2.0 builds on CSS1 and, with very few exceptions, all valid CSS1 style sheets are valid CSS2 style sheets. CSS2 supports media-specific style sheets so that authors may tailor the presentation of their documents to visual browsers, aural devices, printers, braille devices, handheld devices, etc. CSS2 also adds content positioning, downloadable fonts, table layout, features for internationalization, automatic counters and numbering, and some properties related to user interface. See http://www.w3.org/TR/REC-CSS2/
7. "Namespaces in XML", T. Bray, D. Hollander, A. Layman, 14 January 1999. XML namespaces provide a simple method for qualifying names used in XML documents by associating them with namespaces identified by URI. Available at: http://www.w3.org/TR/REC-xml-names
8. Simple Object Access Protocol (SOAP) 1.1: SOAP is a lightweight XML-based protocol for exchange of information in a decentralized, distributed environment. It is a submission to the W3C and is available at http://www.w3.org/TR/SOAP
9. Introduction to CSS3: An introduction and roadmap of the work in progress on CSS level 3. At this time, it is available to W3C members only from: http://www.w3.org/Style/Group/css3-src/css3-roadmap/
XHTML-Print Tags
This section is normative.
Tags Required by XHTML-Print
The World Wide Web Consortium has defined XHTML Basic, a subset of XHTML 1.0 that is targeted at small-format devices such as PDAs and cellular telephones. XHTML Basic is therefore a useful starting point for the definition of XHTML-Print, which is very similar to XHTML Basic in the set of tags required.
The Modularization of XHTML [Reference 3] is a decomposition of XHTML 1.0, and by reference HTML 4.0, into a collection of abstract modules that provide specific types of functionality. XHTML-Print is defined, in part, by inclusion of a set of these modules. The non-CSS portion of XHTML-Print includes the following modules:
Basic Modules
    Structure Module
        body, head, html, title
    Basic Text Module
        abbr, acronym, address, blockquote, br, cite, code, dfn, div, em,
        h1, h2, h3, h4, h5, h6, kbd, p, pre, q, samp, span, strong, var
    Hypertext Module
        a
    List Module
        dl, dt, dd, ol, ul, li
Table Modules
    Basic Table Module
        caption, table, td, th, tr
Image Module
    img
Basic Forms Module (except L0)
    form, input, label, select, option, textarea
MetaInformation Module
    meta
Stylesheet Module (except L0)
    style
Link Module
    link
Base Module
    base
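As an informal illustration, the module list above can be held as a lookup table from which the element set for each candidate level is derived. This is only a sketch: the names MODULES, EXCLUDED_AT_L0 and elements_for_level are hypothetical, and treating L1 and L2 as requiring the same modules is taken from the conformance clauses of this draft.

```python
# Module-to-element table transcribed from the module list in this section.
MODULES = {
    "Structure": ["body", "head", "html", "title"],
    "Basic Text": ["abbr", "acronym", "address", "blockquote", "br", "cite",
                   "code", "dfn", "div", "em", "h1", "h2", "h3", "h4", "h5",
                   "h6", "kbd", "p", "pre", "q", "samp", "span", "strong",
                   "var"],
    "Hypertext": ["a"],
    "List": ["dl", "dt", "dd", "ol", "ul", "li"],
    "Basic Table": ["caption", "table", "td", "th", "tr"],
    "Image": ["img"],
    "Basic Forms": ["form", "input", "label", "select", "option", "textarea"],
    "MetaInformation": ["meta"],
    "Stylesheet": ["style"],
    "Link": ["link"],
    "Base": ["base"],
}

# Modules marked "(except L0)" in the list.
EXCLUDED_AT_L0 = {"Basic Forms", "Stylesheet"}


def elements_for_level(level):
    """Return the set of element names a printer at `level` must accept."""
    if level not in ("L0", "L1", "L2"):
        raise ValueError("unknown conformance level: " + level)
    names = set()
    for module, elements in MODULES.items():
        if level == "L0" and module in EXCLUDED_AT_L0:
            continue  # L0 does not require forms or stylesheets
        names.update(elements)
    return names
```

For example, elements_for_level("L0") contains img but neither form nor style.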
Restrictions on XHTML by XHTML-Print
XHTML-Print also restricts some usage of common XHTML 1.0 tags:
Nesting of Tables is NOT supported
Frames are NOT supported
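A sending client could check a document against these two restrictions before transmission. The following is a minimal sketch using Python's standard XML parser; the function names are hypothetical, and interpreting "frames" as the frameset and frame elements is an assumption.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix of the form '{uri}name'."""
    return tag.rsplit("}", 1)[-1]


def find_restriction_violations(xhtml_text):
    """Return a list of messages for nested tables and frame elements."""
    root = ET.fromstring(xhtml_text)
    problems = []
    for element in root.iter():
        name = local_name(element.tag)
        if name in ("frameset", "frame"):
            problems.append("frames are not supported: <%s>" % name)
        if name == "table":
            # Any table among the descendants of this table is a nested table.
            inner = [d for d in element.iter()
                     if d is not element and local_name(d.tag) == "table"]
            if inner:
                problems.append("nested table found")
    return problems
```

An empty list means neither restriction is violated; the check says nothing about validity against a DTD.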
CSS Conformance
See section 5.3.2, Printer Conformance Levels, for CSS conformance requirements for XHTML-Print conforming devices.
Changes to XHTML for XHTML-Print
This section is normative.
XHTML-Print inherits all the structure, encoding and other basic infrastructure specified by XHTML. The following functions are added, have usage restrictions or must be changed to meet the needs of printing.
Recommended attributes on the <img> tag
Because many printers create the page in a serial manner from top to bottom, it is important for the printer to know the size of images before retrieving the image data itself. This information is then used to create portions of the page layout.
Therefore, the sender is strongly encouraged to include the height and width attributes with the <img> tag. These attributes shall be expressed in units of TBD.
This document does not define what image formats must be supported.
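A sender can verify this recommendation mechanically. The sketch below assumes that the tag in question is img and that the attributes are named height and width; the function name is hypothetical, and no unit checking is attempted because the units are still TBD.

```python
import xml.etree.ElementTree as ET


def images_missing_dimensions(xhtml_text):
    """Return the src of every img element lacking a height or width attribute."""
    root = ET.fromstring(xhtml_text)
    missing = []
    for element in root.iter():
        # Compare on the local name so namespaced documents also work.
        if element.tag.rsplit("}", 1)[-1] != "img":
            continue
        if "height" not in element.attrib or "width" not in element.attrib:
            missing.append(element.get("src", ""))
    return missing
```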
Page Breaks
Because displaying in a browser differs from printing on a paged-media device, the data source may want to control where page breaks occur.
Therefore, a single means is needed to cause a page-break to occur.
Page Break using a CSS2-like Method
[Example: a document titled "Testing Page Breaks" whose body text "Page 1 Text" and "Page 2 Text" is separated by a page break.]
L0 conforming printers shall support this means by inclusion of the pre-defined (implicit) class .pagebreak, shown explicitly defined above. L0 printers are only required to support this class when used with the tag shown in the example above. L1 and L2 printers shall allow usage of this class with all other appropriate tags.
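The behaviour of the .pagebreak class can be sketched as follows. Everything in the sample document is an assumption made for illustration: the use of the p element, the style rule that spells out the implicit class, and the reading that the class forces a break before the element it is applied to.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML-Print document; the choice of <p> and the style rule
# that spells out the implicit class are assumptions for illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Testing Page Breaks</title>
    <style type="text/css">.pagebreak { page-break-before: always }</style>
  </head>
  <body>
    <p>Page 1 Text</p>
    <p class="pagebreak">Page 2 Text</p>
  </body>
</html>"""


def split_into_pages(xhtml_text):
    """Group paragraph text into pages, starting a new page at class="pagebreak"."""
    root = ET.fromstring(xhtml_text)
    pages = [[]]
    for element in root.iter():
        if element.tag.rsplit("}", 1)[-1] != "p":
            continue
        # Assumed semantics: the break occurs before the marked element.
        if "pagebreak" in element.get("class", "").split():
            pages.append([])
        pages[-1].append("".join(element.itertext()).strip())
    return pages
```

Under these assumptions, split_into_pages(SAMPLE) yields two pages, one per paragraph.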
Running headers and Footers
Except for the L0 conforming printer, a means is needed to create a running-header and a running-footer on the printed page. Current work in progress by the W3C on paged media defines a very robust method for adding margin boxes to the top, bottom, left and right of the page. For the L1 and L2 conforming printers, a reduced set from CSS3 using top and bottom margin boxes is proposed utilizing the same @page rules method.
Utilizing the terminology of CSS2 and CSS3, a margin box is defined in conjunction with the page box and page area (as shown in Figure 1) to create an area into which running-header and running-footer text can be inserted.
CSS3 proposes the ability to left-align, right-align and center the text horizontally as well as methods to top-align, bottom-align and center the text vertically within the margin boxes. For XHTML-Print L1 conforming printers, vertical controls are not required to be supported. Instead, for XHTML-Print L1 conforming printers, the running-header text shall be top aligned in the margin box and the running-footer text shall be bottom aligned in the margin box.
CSS3 proposes methods for the printing device to automatically include:
page number
total pages in the document
date
time
file name
into the running-header and running-footer. XHTML-Print L1 and L2 conforming printers are only required to support inserting a page number using the counter(pages) object. If required, the sending appliance must provide the other information within the text string to be printed in the margin box.
Figure 1
The following are sample XHTML/CSS fragments used to create running-headers and running-footers.
The above example creates a running header that is left aligned, at 150% of normal font size and bold, in Helvetica, Arial or the default sans-serif font, whichever is available.
The above example creates a running footer such as "Page 14", centered on the page in a font 80% of normal size, in Times, Palatino or the default serif font, whichever is available.
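Rules matching the two descriptions above might be generated as in the following sketch. The margin-box names @top-left and @bottom-center follow the syntax of later CSS paged-media drafts and are an assumption here, as are the helper name margin_box_rule and the header text "Quarterly Report". The footer uses counter(pages) because this draft names it as the page-number object; later CSS drafts use counter(page) for the current page number.

```python
def margin_box_rule(box, declarations):
    """Build one @page rule containing a single margin box, as CSS text."""
    body = "; ".join("%s: %s" % (prop, value) for prop, value in declarations)
    return "@page { @%s { %s } }" % (box, body)


# Running header: left aligned, 150% size, bold, sans-serif fallback chain.
header = margin_box_rule("top-left", [
    ("content", '"Quarterly Report"'),          # hypothetical header text
    ("text-align", "left"),
    ("font-size", "150%"),
    ("font-weight", "bold"),
    ("font-family", "Helvetica, Arial, sans-serif"),
])

# Running footer: "Page N" centered, 80% size, serif fallback chain.
footer = margin_box_rule("bottom-center", [
    ("content", '"Page " counter(pages)'),      # counter name as given in this draft
    ("text-align", "center"),
    ("font-size", "80%"),
    ("font-family", "Times, Palatino, serif"),
])
```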
Inline Image Data
In web-based applications of XHTML, image data is contained in a separate file on the web server that the user agent retrieves. Some low-cost, resource-constrained clients may want to include images in their print output but cannot afford to include a server. In this case, the image data must be contained in the XHTML file sent to the printer. To accomplish this, the image data shall be provided immediately following the data using base64 encoding in a manner similar to SOAP's Array of Bytes type (see References). This is not a recommended practice, especially for large images, as the encoding inefficiency of base64 causes roughly a 33% increase in file size.
{details need work; this still has SOAP stuff in it which needs to be modified}
An array of bytes representing an image shall be encoded as a single-reference. The rules for an array of bytes are similar to those for a string. In particular, the containing element of the array of bytes value shall have an "id" attribute. Additional elements may access the content and shall have matching "href" attributes. The recommended representation of an opaque array of bytes is the 'base64' encoding defined in XML Schemas, which uses the base64 encoding algorithm defined in RFC2045. However, the line length restrictions that normally apply to base64 data in MIME do not apply in SOAP. A "SOAP-ENC:base64" subtype is supplied for use with SOAP.
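The size cost of base64 follows from the encoding itself: every 3 input bytes become 4 output characters. The sketch below shows an unbroken (no line length limit) encoding and measures the overhead; the function names are hypothetical and the payload is stand-in data, not a real image.

```python
import base64


def encode_inline_image(image_bytes):
    """Return the image as one unbroken base64 string (no line breaks)."""
    return base64.b64encode(image_bytes).decode("ascii")


def base64_overhead(image_bytes):
    """Return the relative size increase caused by base64 encoding."""
    encoded = encode_inline_image(image_bytes)
    return (len(encoded) - len(image_bytes)) / len(image_bytes)


payload = bytes(range(256)) * 12   # 3072 stand-in bytes, not a real image
encoded = encode_inline_image(payload)
```

Here the 3072 input bytes encode to 4096 characters, an increase of one third.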
Side-by-Side Images
Low-cost printers today often have very little memory in which page data can be stored before being printed. As such, they must build and print the page in swaths on the fly from the top of the page to the bottom. To enable the use of XHTML-Print in these low-cost printers, some restrictions on the order of images contained in the XHTML-Print datastream must be added.
If two or more images will be even partially side-by-side on the printed page, they should be included by reference rather than included in-line. This allows the printer to get chunks of each image, as it needs them, as it prints down the page.
If the image data is included in-line, an XHTML-Print conforming printer may choose not to print one or more of the side-by-side images.
Conformance
This section is normative.
XHTML Document Type Conformance
It is possible to modify existing document types and define wholly new document types using both modules defined in this specification and other modules. Such a document type conforms to this specification when it meets the following criteria:
The document type must be defined using one of the implementation methods defined by the W3C (currently this is limited to XML DTDs, but XML Schema will be available soon).
The document type must have a unique identifier as defined in XHTML-Mod [see References] Naming Rules.
The document type must include, at a minimum, the Structure, Hypertext, Basic Text, and List modules defined in the XHTML-Mod specification.
For each of the W3C-defined modules that are included, all of the elements, attributes, and any required minimal content models must be included (and optionally extended) in the document type's content model.
The document type may define additional elements and attributes. However, these must be in their own XML Namespace [see References].
XHTML Document Conformance
Documents that rely upon XHTML-family document types are considered XHTML conforming if they validate against their referenced document type.
Printer Conformance
Formatting/Rendering Rules
In order to be consistent with the XML 1.0 Recommendation [see References], the printer must parse and evaluate an XHTML document to determine if the document is well formed. If the printer claims to be a validating printer, it must also validate documents against their referenced DTDs according to XML.
When the printer claims to support facilities defined within this specification or required by this specification through normative reference, it must do so in ways consistent with the facilities' definition.
When a printer processes an XHTML document as generic XML, it shall only recognize attributes of type ID (e.g. the id attribute on most XHTML elements) as fragment identifiers.
Images:
If a printer encounters an image in a format it does not support, it will reserve the space specified by the height and width attributes, optionally drawing a box of the specified size around this space.
If the image format is not supported and the height and width attributes were omitted, the image is omitted and no space is reserved.
If the image format is supported and the height and width attributes were omitted, the printer may choose to omit the image from the page.
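The three image rules above, together with the implied normal case, amount to a small decision table. A sketch follows; the function name and the action labels are hypothetical.

```python
def image_action(format_supported, has_height_and_width):
    """Map the two conditions in the image rules to a rendering action."""
    if format_supported and has_height_and_width:
        return "render"            # normal case, implied by the rules
    if format_supported:
        return "render-or-omit"    # printer may choose to omit the image
    if has_height_and_width:
        return "reserve-space"     # optionally outlined with a box
    return "omit"                  # no image and no space reserved
```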
If a printer encounters an element it does not recognize, it must render the element's content as if the element and its end tag were not present at all.
If a printer encounters an attribute it does not recognize, it must ignore the entire attribute specification (i.e., the attribute and its value).
If a printer encounters an attribute value it does not recognize, it must use the default attribute value.
If a printer encounters an entity reference (other than one of the predefined entities) for which the printer has processed no declaration (which could happen if the declaration is in the external subset which the printer has not read), the entity reference should be rendered as the characters (starting with the ampersand and ending with the semi-colon) that make up the entity reference.
When rendering content, printers that encounter characters or character entity references that are recognized but not renderable should display the document in such a way that it is obvious to the user that normal rendering has not taken place.
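The rules above for unrecognized elements and attributes can be sketched as a tree rewrite: content of an unknown element is spliced into its parent, and unknown attributes are dropped. The sets KNOWN_ELEMENTS and KNOWN_ATTRIBUTES below are small illustrative subsets, the function names are hypothetical, and namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET

KNOWN_ELEMENTS = {"html", "body", "p", "em", "strong"}   # illustrative subset
KNOWN_ATTRIBUTES = {"class", "id"}                       # illustrative subset


def _known_attrs(element):
    """Keep only the attributes the printer recognizes."""
    return {k: v for k, v in element.attrib.items() if k in KNOWN_ATTRIBUTES}


def _append_text(parent, text):
    """Append text after parent's last child, or to parent.text if childless."""
    if not text:
        return
    if len(parent):
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def _copy_into(source, target):
    """Copy source's content into target, splicing unknown elements' content."""
    _append_text(target, source.text)
    for child in source:
        if child.tag in KNOWN_ELEMENTS:
            copy = ET.SubElement(target, child.tag, _known_attrs(child))
            _copy_into(child, copy)
        else:
            _copy_into(child, target)   # tags dropped, content kept
        _append_text(target, child.tail)


def strip_unrecognized(xhtml_text):
    """Return markup with unknown elements unwrapped and unknown attributes removed."""
    source = ET.fromstring(xhtml_text)
    root = ET.Element(source.tag, _known_attrs(source))  # root assumed known
    _copy_into(source, root)
    return ET.tostring(root, encoding="unicode")
```

The sketch does not cover the default-value rule for unrecognized attribute values, which depends on per-attribute defaults from the DTD.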
The following characters are defined in XML as whitespace characters:
Space (&#x0020;)
Tab (&#x0009;)
Carriage return (&#x000D;)
Line feed (&#x000A;)
The XML processor normalizes different systems' line-end codes into one single line-feed character, which is passed up to the application. In addition, the XHTML printer must treat the following characters as whitespace:
Form feed (&#x000C;)
Zero-width space (&#x200B;)
In elements where the xml:space attribute is set to preserve, the printer must leave all whitespace characters intact (with the exception of leading and trailing whitespace characters, which should be removed). Otherwise, whitespace is handled according to the following rules:
All whitespace surrounding block elements should be removed.
Comments are removed entirely and do not affect whitespace handling. One whitespace character on either side of a comment is treated as two white space characters.
Leading and trailing whitespace inside a block element must be removed.
Line feed characters within a block element must be converted into a space (except when the xml:space attribute is set to preserve).
A sequence of white space characters must be reduced to a single space character (except when the xml:space attribute is set to preserve).
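For the text of a single block element, the rules above reduce to collapsing whitespace runs and trimming the ends. A minimal sketch follows; the function name and the preserve flag (standing in for xml:space set to preserve) are hypothetical, and the handling of comments and of whitespace between block elements is not covered.

```python
import re

# Whitespace per the rules above: space, tab, CR, LF, form feed, zero-width space.
WHITESPACE_CHARS = " \t\r\n\f\u200b"
_WHITESPACE_RUN = re.compile(r"[ \t\r\n\f\u200b]+")


def normalize_block_text(text, preserve=False):
    """Apply the block-level whitespace rules to one element's text."""
    if preserve:
        # xml:space="preserve": keep interior whitespace, trim only the ends.
        return text.strip(WHITESPACE_CHARS)
    # Line feeds become spaces and runs collapse to a single space.
    collapsed = _WHITESPACE_RUN.sub(" ", text)
    # Leading and trailing whitespace inside the block is removed.
    return collapsed.strip(" ")
```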
With regard to rendition, the printer should render the content in a manner appropriate to the language in which the content is written.
In languages whose primary script is Latinate, the ASCII space character is typically used to encode both grammatical word boundaries and typographic whitespace.
In languages whose script is related to Nagari (e.g., Sanskrit, Thai, etc.), grammatical boundaries may be encoded using the ZW space character, but will not typically be represented by typographic whitespace in rendered output.
Languages using Arabiform scripts may encode typographic whitespace using a space character, but may also use the ZW space character to delimit internal grammatical boundaries (what look like words in Arabic to an English eye frequently encode several words, e.g. kitAbuhum = kitAbu-hum = book them == their book).
Languages in the Chinese script tradition typically neither encode such delimiters nor use typographic whitespace in this way.
Whitespace in attribute values is processed according to [XML].
Printer Conformance Levels
L0 Printer
An L0 conforming printer shall support all XHTML Modules listed in clause 3.1 with the exception of the Basic Forms and Stylesheet modules.
An L0 conforming printer shall support the implicit .pagebreak class and allow its application to the tag specified for L0 in the Page Breaks section.
With the exception of the implicit .pagebreak class, an L0 conforming printer is not required to support stylesheets or CSS in any way.
L1 Printer
In addition to the L0 conformance requirements:
An L1 conforming printer shall support all XHTML Modules listed in clause 3.1.
An L1 conforming printer shall allow the application of the .pagebreak class to other appropriate tags.
An L1 conforming printer shall print a static version of a form using default values as specified in the form.
For an L1 printer, the following CSS1 constructs shall be supported:
Block item properties:
    Font (font-family, font-style, font-weight, font-size)
    Color (both names and RGB value)
    Text decoration (underline, overline, line-through)
    Text align (left, right, center, justify)
    Text indent
    Line height
Box Properties
    none
Classification Properties
    white-space
    list-style-type
    list-style-position
Units
    em
    ex
    pixel (adjust from screen to printer resolution?)
    percent
For an L1 printer, the following CSS2 constructs shall be supported:
@media print
@page rules
    size
    :left
    :right
    :first
    crop is NOT supported
    named pages for content placement control is NOT supported
Page Break Properties as applied to tags:
    page-break-before
    page-break-after
    page-break-inside
L2 Printer
In addition to the L1 conformance requirements:
No additional XHTML modules are required
For an L2 conforming printer, the following CSS1 constructs shall be supported:
Block item properties:
    Font (font-family, font-style, font-variant, font-weight, font-size)
    Color (both names and RGB value)
    Background (image, background-attachment, background-position)
    Text (word-spacing, letter-spacing, vertical-align, text-transform)
    Text decoration (underline, overline, line-through)
    Text align (left, right, center, justify)
    Text indent
    Line height
Box Properties
    margin-top, margin-bottom, margin-right, margin-left, margin
    padding-top, padding-bottom, padding-right, padding-left, padding
    border-top-width, border-bottom-width, border-right-width, border-left-width, border-width
    border-top-color, border-bottom-color, border-right-color, border-left-color, border-color
    border-top-style, border-bottom-style, border-right-style, border-left-style, border-style
    border-top, border-bottom, border-right, border-left, border
Classification Properties
    white-space
    list-style-type
    list-style-image
    list-style-position
Units
    em
    ex
    pixel (adjust from screen to printer resolution?)
    percent
For an L2 conforming printer, the following CSS2 constructs shall be supported:
@media print
@page rules
    size
    :left
    :right
    :first
    crop is NOT supported
    named pages for content placement control is NOT supported
Page Break Properties as applied to block elements
    page-break-before
    page-break-after
    page-break-inside
Absolute positioning controls
Work Items
This section is informative, temporary and will be removed upon completion.
Creation of an XHTML-Print Doctype
Creation of an XHTML-Print DTD
Specific behaviors for some markup lacking print specificity
Define how in-line images are included? (Multipart MIME encoding?)
™ XHTML is a trademark of the World Wide Web Consortium.