Subject: DocBook V5.0 spec
At long last, here's a draft of the DocBook V5.0 specification. Comments most welcome.

Title: The DocBook Schema
Working Draft 5.0a1, 29 June 2005
Copyright © 2001, 2002, 2003, 2004, 2005 The Organization for the Advancement of Structured Information Standards [OASIS]. All Rights Reserved.
DocBook is a general-purpose XML schema particularly well suited to books and papers about computer hardware and software (though it is by no means limited to these applications).
The DocBook Technical Committee maintains the DocBook schema. Starting with V5.0, DocBook is normatively available as a [RELAX NG] Schema. W3C XML Schema and Document Type Definition (DTD) versions are also available.
The Version 5.0 release is a complete rewrite. In programming-language terms, think of it as a code refactoring.
This rewrite introduces a large number of backwards-incompatible changes. Essentially all DocBook V4.x documents will have to be modified to validate against DocBook V5.0. An XSLT 1.0 stylesheet is provided to ease this transition.
The DocBook Technical Committee welcomes bug reports and requests for enhancement (RFEs) from the user community. The current list of outstanding requests is available through the SourceForge tracker interface. This is also the preferred mechanism for submitting new requests. Old RFEs, from a legacy tracking system, are archived for reference.
The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this Working Draft are to be interpreted as described in [RFC 2119]. Note that for reasons of style, these words are not capitalized in this document.
In V5.0, DocBook has been rewritten as a native RELAX NG grammar. The goals of this redesign were to produce a schema that:
Under the ordinary operating rules of DocBook evolution, the only backwards-incompatible changes that could be made in DocBook V5.0 were those announced in DocBook V4.0. Because this is a complete rewrite, the Technical Committee gave itself the freedom to make "unannounced" backwards-incompatible changes for this one release.
A number of elements have been removed from DocBook. Many of these have been replaced by simpler, more versatile alternatives. Others have simply been removed because they are not believed to be widely used.
Table 1. DocBook Element Changes
The content models of many inlines have been reduced, sometimes drastically. The parameter entity customization of DocBook V4.x and previous versions resulted in very broad content models for some inlines.
Consider, for example,
command ::=
  (#PCDATA|link|olink|ulink|action|application|classname|methodname|
   interfacename|exceptionname|ooclass|oointerface|ooexception|
   command|computeroutput|database|email|envar|errorcode|errorname|
   errortype|errortext|filename|function|guibutton|guiicon|guilabel|
   guimenu|guimenuitem|guisubmenu|hardware|interface|keycap|keycode|
   keycombo|keysym|literal|code|constant|markup|medialabel|
   menuchoice|mousebutton|option|optional|parameter|prompt|property|
   replaceable|returnvalue|sgmltag|structfield|structname|symbol|
   systemitem|uri|token|type|userinput|varname|nonterminal|anchor|
   remark|subscript|superscript|inlinegraphic|inlinemediaobject|
   indexterm|beginpage)*
In DocBook V5.0,
command ::=
  * Zero or more of:
      o text
      o alt
      o anchor
      o annotation
      o biblioref
      o indexterm
      o inlinemediaobject
      o link
      o phrase
      o remark
      o replaceable
      o subscript
      o superscript
      o xref
DocBook V5.0 may be overzealous in its simplification of content models. The Technical Committee expects to adjust these simplifications during user testing. Users are encouraged to report places where formerly valid documents can no longer be made valid because content models have been reduced.
DocBook V4.x has
There’s an historical reason for the large number of unique names: customizers might very well want to adjust the content models of info elements at different levels. For example, a copyright statement might be required at the book level, or an author forbidden at the sub-section level. In DTDs, there’s only one content model allowed per element name, so in order to support independent customization, each info element must have a different name.
In RELAX NG, no such limitation exists. We can use patterns to achieve both a single
DocBook V5.0 enforces the constraint that titles are required on
In DocBook V4.x and earlier, the presence of a document type declaration served as a mechanism for identifying the DocBook version of a document. Although the declaration was not actually required, it was present in the vast majority of DocBook documents.
In RELAX NG, no similar declaration exists. Although a document type declaration might still be present, it seems likely that this will not usually be the case.
Nevertheless, downstream processors may benefit from some indication of the version of DocBook being used. As a result DocBook V5.0 adds a new
Mixing versions is explicitly allowed and the version attribute may be used on other elements as well. This might be the case, for example, in a compound document constructed from multiple documents each with its own version.
DocBook V5.0 enforces attribute co-constraints such as the
In DocBook V5.0, HTML tables and CALS tables are independently specified. Where the DTD of DocBook V4.x allows for incoherent mixing of the two models, DocBook V5.0 forbids such mixtures.
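One way to picture the separation of the two table models is a choice between two independent patterns that both match an element named table. The sketch below is a self-contained toy grammar, not the DocBook schema: the pattern names db.table, db.cals.table and db.html.table, the namespace URI, and the heavily reduced content models are all assumptions made for illustration.

```rnc
# Toy grammar: the namespace URI and all pattern names are assumptions.
default namespace db = "http://docbook.org/ns/docbook"

start = db.table

# A table is either a CALS table or an HTML table, never a blend of both.
db.table = db.cals.table | db.html.table

# Reduced CALS model: title, then tgroup/tbody/row/entry.
db.cals.table =
  element table {
    element title { text },
    element tgroup {
      attribute cols { xsd:positiveInteger },
      element tbody {
        element row { element entry { text }+ }+
      }
    }+
  }

# Reduced HTML model: caption, then tr/td.
db.html.table =
  element table {
    element caption { text },
    element tr { element td { text }+ }+
  }
```

Because each alternative is a complete content model, a table that combines, say, a tgroup with tr children matches neither branch and is rejected.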
DocBook V5.0 adds a few simple data types. For example, the
Some of these constraints, such as the requirement that elements like
Starting with DocBook V5.0, the
Support for extended links is provided through the
Accessibility is improved by allowing both inline and block annotations in most contexts. The
The DocBook V4.x markup for Tables of Contents, or more generally for Lists of Titles, was complex and had not evolved quite in step with the rest of DocBook. In DocBook V5.0, it has all been replaced by a quite simple, recursive
While most Tables of Contents and Lists of Titles are generated automatically and authors never have to produce markup for them by hand, this simplified content model should make it easier for authors to generate when necessary. One possible application of hand-authored
Grammar-based validation technologies (like RELAX NG) and rule-based validation technologies (like Schematron) are naturally complementary. Mixing them allows us to play to the strengths of each without stretching either to enforce constraints that it isn’t readily designed to enforce.
For example, DocBook NG requires that the root element of a document have an explicit version attribute. Because there are a great many elements that can be root elements in DocBook, and because they can almost all appear as descendants of a root element as well, it would be tedious to express this constraint in RELAX NG. But it is easy in a rule-based schema language.
DocBook V5.0 uses Schematron where appropriate.
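As a rough illustration of how a root-element version rule might be carried inside a RELAX NG compact grammar, the sketch below attaches a Schematron rule to an element pattern as an annotation. It is a self-contained toy, not an excerpt from the DocBook schema: the DocBook namespace URI, the pattern names, the reduced content model, and the assertion wording are assumptions. The Schematron namespace shown is that of Schematron 1.5.

```rnc
# Toy grammar: the DocBook namespace URI and pattern names are assumptions.
namespace s = "http://www.ascc.net/xml/schematron"
default namespace db = "http://docbook.org/ns/docbook"

# Bind the db prefix for use in the Schematron XPath expressions.
s:ns [ prefix = "db" uri = "http://docbook.org/ns/docbook" ]

start = db.book

db.book =
  [
    # The rule fires only when book is the document element.
    s:pattern [
      name = "Root must have version"
      s:rule [
        context = "/db:book"
        s:assert [
          test = "@version"
          "If this element is the root element, it must have a version attribute."
        ]
      ]
    ]
  ]
  # The grammar itself leaves version optional; the rule above tightens it.
  element book { attribute version { text }?, db.title }

db.title = element title { text }
```

A validator that understands only RELAX NG ignores the annotation and accepts a root book without version; a Schematron-aware processor would report the assertion failure.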
From the very beginning, one of the goals of DocBook has been that users should be able to produce customizations that are either subsets or extensions of DocBook.
For users familiar with the intricacies of XML DTD syntax and the rather complex and highly stylized patterns of parameter entity usage in DocBook, this is possible in DocBook V4.x.
In DocBook V5.0, we hope to take advantage of RELAX NG's more robust design (and its lack of pernicious determinism rules) to make customization easier.
Three schema design patterns get us most of the way there.
DocBook elements, particularly the inlines, can be divided into broad classes: general purpose, technical, error-related, operating-system related, bibliographic, publishing, etc. In DocBook V5.0, these are collected together in named patterns.
To add a new inline,
    db.technical.inlines |= endpoint
    db.programming.inlines |= endpoint
    db.os.inlines |= endpoint
Much the same concept was used in DocBook V4.x, where instead of patterns we had parameter entities. However, the constraints of DTD validation severely limit the circumstances under which an element can appear twice in a content model. That meant that adding an element to one parameter entity might make it an error to add it to another. Such constraints do not exist in RELAX NG, which greatly simplifies customization.
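The pattern extension described above might look like the following when written out as a complete customization layer. This is a minimal sketch under stated assumptions: the file name docbook.rnc, the namespace URI, the pattern name db.endpoint, and the text-only content model of the new element are hypothetical. Only the three inline class names come from the example above.

```rnc
# Customization layer sketch; the namespace URI is an assumption.
default namespace db = "http://docbook.org/ns/docbook"

# Pull in the base schema (hypothetical file name).
include "docbook.rnc"

# Hypothetical new inline element with an illustrative content model.
db.endpoint = element endpoint { text }

# |= adds one more alternative to each existing class pattern.
# The same element may join several classes without any conflict.
db.technical.inlines |= db.endpoint
db.programming.inlines |= db.endpoint
db.os.inlines |= db.endpoint
```

Because RELAX NG has no determinism rule, it does not matter that endpoint becomes reachable through three different patterns that may appear in the same content model.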
Each element in DocBook V5.0 is defined by its own pattern. To change the content model of an element, only that pattern need be redefined. To remove an element from DocBook, that pattern can be redefined as "
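Redefining a single element pattern could be sketched as below, using the override block of a RELAX NG include. The file name, the namespace URI, and the pattern names db.command, db.replaceable and db.remark are assumptions modeled on the db.* naming seen earlier, and the new content model is deliberately simplified (it drops any attributes the real element carries). The removal uses RELAX NG's standard notAllowed pattern, which matches nothing.

```rnc
# Customization layer sketch; the namespace URI is an assumption.
default namespace db = "http://docbook.org/ns/docbook"

# Definitions inside the braces replace those of the same name
# in the included schema (hypothetical file name).
include "docbook.rnc" {

  # Hypothetical: narrow command to text and replaceable only.
  # A real redefinition would also restate the element's attributes.
  db.command = element command { (text | db.replaceable)* }

  # Hypothetical: remove remark by making its pattern unmatchable.
  db.remark = notAllowed
}
```

Every choice that refers to the removed pattern simply loses that alternative, so no other pattern has to be edited. Note that an override is only legal for a name the included schema actually defines.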
There’s an XSLT 1.0 stylesheet for performing conversion from DocBook V4.x to DocBook V5.0. Presented with a valid DocBook V4.x document, it attempts to produce a valid DocBook V5.0 document.
It succeeds entirely automatically for the most part, though human intervention is suggested for constructs that might have multiple interpretations (and therefore multiple possible transformations).
Users are encouraged to report documents that are not successfully transformed by the stylesheet, especially those which do have valid DocBook V5.0 representations.
See http://www.relaxng.org/ for a list of tools that can validate an XML document using RELAX NG. Note that not all products are capable of evaluating the Schematron assertions in the schema.
This appendix registers a new MIME media type, "
For documents labeled as "
The following individuals were members of the committee during the formulation of this Working Draft:
Copyright © The Organization for the Advancement of Structured Information Standards [OASIS] 2001, 2002, 2003, 2004, 2005. All Rights Reserved.
OASIS takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on OASIS's procedures with respect to rights in OASIS specifications can be found at the OASIS website. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification, can be obtained from the OASIS Executive Director.
OASIS invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to implement this specification. Please address the information to the OASIS Executive Director.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to OASIS, except as needed for the purpose of developing OASIS specifications, in which case the procedures for copyrights defined in the OASIS Intellectual Property Rights document must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
OASIS has been notified of intellectual property rights claimed in regard to some or all of the contents of this specification. For more information consult the online list of claimed rights.
For information on whether any patents have been disclosed that may be essential to implementing this specification, and any offers of patent licensing terms, please refer to the Intellectual Property Rights section of the DocBook web page (http://www.oasis-open.org/committees/docbook/).
[RELAX NG] James Clark, editor. RELAX NG Specification (Committee Specification). OASIS. 2001.
[XML] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, et al., editors. Extensible Markup Language (XML) 1.0 (Third Edition). World Wide Web Consortium, 04 Feb 2004.
[XLink11] Steven DeRose, Eve Maler, David Orchard, Norman Walsh, editors. XML Linking Language (XLink) Version 1.1. World Wide Web Consortium, 2005.
[RFC 2119] IETF (Internet Engineering Task Force). RFC 2119: Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. 1997.
[RFC 3023] IETF (Internet Engineering Task Force). RFC 3023: XML Media Types. M. Murata, S. St. Laurent, D. Kohn. 2001.
[DocBook: TDG] Norman Walsh and Leonard Muellner. DocBook: The Definitive Guide. O’Reilly & Associates, 1999.
[SGML] JTC 1, SC 34. ISO 8879:1986 Information processing -- Text and office systems -- Standard Generalized Markup Language (SGML). 1986.
[W3C XML Schema] Henry S. Thompson, David Beech, Murray Maloney, et al., editors. XML Schema Part 1: Structures. World Wide Web Consortium, 2000.
[W3C XML Datatypes] Paul V. Biron and Ashok Malhotra, editors. XML Schema Part 2: Datatypes. World Wide Web Consortium, 2000.
Be seeing you,
  norm

-- 
Norman Walsh <email@example.com>
http://www.oasis-open.org/docbook/
Chair, DocBook Technical Committee

Human felicity is produced not so much by great pieces of good fortune that seldom happen, as by little advantages that occur every day.--Benjamin Franklin