

Subject: DocBook V5.0 spec


It still needs (a lot) of work, but here's a first cut.

Title: The DocBook Schema

The DocBook Schema

Working Draft 5.0b1, 09 Jul 2004

Document identifier:

wd-docbook-docbook-5.0b1 (XML, HTML, PDF)

Editor:

Norman Walsh, Sun Microsystems, Inc. 

Abstract:

DocBook is a general-purpose [XML] schema particularly well suited to books and papers about computer hardware and software (though it is by no means limited to these applications).

The Version 5.0 release is a complete rewrite of DocBook in RELAX NG. Although other forms will also be provided, including W3C XML Schema and XML and SGML DTDs, the RELAX NG Schema is now the normative schema.

Status:

This Working Draft is an editor's draft. It does not necessarily represent the consensus of the committee.

Please send comments on this specification to the list. To subscribe, please use the OASIS Subscription Manager.

The errata page for this specification is at http://docbook.org/specs/docbook-errata.html.



1. Introduction

DocBook is a general-purpose XML schema particularly well suited to books and papers about computer hardware and software (though it is by no means limited to these applications).

The DocBook Technical Committee maintains the DocBook schema. Starting with V5.0, DocBook is normatively available as a [RELAX NG] Schema. W3C XML Schema and Document Type Definition (DTD) versions are also available.

The Version 5.0 release is a complete rewrite. It introduces a large number of backwards-incompatible changes. Essentially all DocBook V4.x documents will have to be modified to validate against DocBook V5.0. An XSLT 1.0 stylesheet is provided to ease this transition.

The DocBook Technical Committee welcomes bug reports and requests for enhancement (RFEs) from the user community. The current list of outstanding requests is available through the SourceForge tracker interface. This is also the preferred mechanism for submitting new requests. Old RFEs, from a legacy tracking system, are archived for reference.

2. Terminology

The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this Working Draft are to be interpreted as described in [RFC 2119]. Note that for reasons of style, these words are not capitalized in this document.

3. The DocBook RELAX NG Schema V5.0

The DocBook RELAX NG Schema is distributed from the DocBook site at OASIS. DocBook is also available from the mirror on http://docbook.org/.

3.1. Changes in DocBook V5.0

In V5.0, DocBook has been redesigned as a native RELAX NG grammar. The goals of this redesign were to produce a schema that:

  1. “feels like” DocBook. Most existing documents should still be valid or it should be possible to transform them in simple, mechanical ways into valid documents.

  2. enforces as many constraints as possible in the schema. Some additional constraints are expressed with Schematron rules.

  3. cleans up the content models.

  4. gives users the flexibility to extend or subset the schema in an easy and straightforward way.

  5. can be used to generate XML DTD and W3C XML Schema versions of DocBook.

Because this is a complete rewrite, the Technical Committee gave itself the freedom to make unannounced backwards-incompatible changes for this one release.

3.1.1. Smaller Content Models

The content models of many inlines have been reduced, sometimes drastically. The parameter entity customization of DocBook V4.x and previous versions resulted in very broad content models for some inlines (consider term in DocBook V4.4, for example).

DocBook V5.0 may be too restrictive in this area.

3.1.2. Uniform Info Elements

DocBook V4.x has setinfo, bookinfo, chapterinfo, appendixinfo, sectioninfo, etc. DocBook would be smaller and simpler if it had a single info element in all these places.

There’s an historical reason for the large number of unique names: customizers might very well want to adjust the content models of info elements at different levels. For example, a copyright statement might be required at the book level, or an author forbidden at the sub-section level. In DTDs, there’s only one content model allowed per element name. So in order to support independent customization, each info element must have a different name.

In RELAX NG, no such limitation exists. We can use patterns to provide a single info element while still allowing customizers to change its content model in different contexts.
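The idea of one info element with context-dependent content can be sketched outside the schema as well. The following is a minimal Python illustration, not the schema's own mechanism: the rule table is hypothetical and simply encodes the two examples given above (a copyright required at the book level, an author forbidden at the section level). It also assumes un-namespaced element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-context rules; the values are illustrative only and are
# not taken from the DocBook V5.0 schema.
INFO_RULES = {
    "book": {"required": {"copyright"}, "forbidden": set()},
    "section": {"required": set(), "forbidden": {"author"}},
}

def check_info(root):
    """Return messages for info elements that break the rules of their parent."""
    problems = []
    for parent in root.iter():
        rules = INFO_RULES.get(parent.tag)
        info = parent.find("info")  # direct child only
        if rules is None or info is None:
            continue
        children = {child.tag for child in info}
        for name in sorted(rules["required"] - children):
            problems.append(f"{parent.tag}/info: missing {name}")
        for name in sorted(rules["forbidden"] & children):
            problems.append(f"{parent.tag}/info: {name} not allowed")
    return problems

doc = ET.fromstring(
    "<book><info><title>T</title></info>"
    "<chapter><section><info><author/></info></section></chapter></book>"
)
print(check_info(doc))
```

The element name stays the same everywhere; only the rule selected by the parent changes, which is the property that DTDs could not offer.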

3.1.3. Required Titles

DocBook V5.0 enforces the constraint that titles are required on articles and other large structures where they are effectively optional in DocBook V4.x. (They are optional only in the sense that DTDs are unable to enforce the constraint that they be present; the documentation has always made it clear that titles were required.)
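As a rough illustration of what this constraint means for a document, the sketch below reports structures that lack a title. The set of element names is an illustrative subset chosen for the example, and accepting a title inside info is an assumption; the schema itself is the authority on both points.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of "large structures"; the real list comes from the schema.
TITLED = {"book", "article", "chapter", "appendix", "section"}

def missing_titles(root):
    """Return the tags of titled structures with neither title nor info/title."""
    missing = []
    for el in root.iter():
        if el.tag not in TITLED:
            continue
        # Assumption: a title may appear directly or inside the info wrapper.
        if el.find("title") is None and el.find("info/title") is None:
            missing.append(el.tag)
    return missing

doc = ET.fromstring(
    "<article><title>A</title>"
    "<section><para>No title here.</para></section></article>"
)
print(missing_titles(doc))
```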

3.1.4. Required Version

In DocBook V4.x and earlier, the presence of a document type declaration served as a mechanism for identifying the DocBook version of a document. Although the declaration was not actually required, it was present in the vast majority of DocBook documents.

In RELAX NG, no similar declaration exists. Although a document type declaration might still be present, it seems likely that this will not usually be the case.

Nevertheless, downstream processors may need to have some indication of the version of DocBook being used. As a result DocBook V5.0 adds a new version attribute which must be present on the document element of a DocBook document.

Mixing versions is explicitly allowed and the version attribute may be used on other elements as well. This might be the case, for example, in a compound document.
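A downstream processor might inspect the version attribute along the following lines. This is a sketch only: the function name is hypothetical, the version values in the sample are placeholders, and element names are assumed to be un-namespaced.

```python
import xml.etree.ElementTree as ET

def version_report(root):
    """Return (root_has_version, [(tag, version), ...]) for the whole tree."""
    found = [
        (el.tag, el.get("version"))
        for el in root.iter()
        if el.get("version") is not None
    ]
    return root.get("version") is not None, found

# A compound document: the embedded article declares its own version.
# Both version strings are placeholders chosen for the example.
doc = ET.fromstring(
    '<book version="5.0"><title>T</title>'
    '<article version="5.1"><title>A</title></article></book>'
)
print(version_report(doc))
```

The first value tells the processor whether the required attribute is present on the document element; the list shows where versions are mixed.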

3.1.5. Co-Constraints

DocBook V5.0 enforces attribute co-constraints such as the class/otherclass attributes on biblioid.
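To make the notion of a co-constraint concrete, here is a small check written in Python rather than in the schema language. It assumes the rule is "otherclass is required when class is other and not allowed otherwise"; that reading is an assumption made for the example, not a quotation from the schema.

```python
import xml.etree.ElementTree as ET

def biblioid_problems(root):
    """Check the assumed class/otherclass pairing on every biblioid element."""
    problems = []
    for el in root.iter("biblioid"):
        cls, other = el.get("class"), el.get("otherclass")
        if cls == "other" and other is None:
            problems.append("class='other' requires otherclass")
        elif cls != "other" and other is not None:
            problems.append("otherclass is only allowed with class='other'")
    return problems

doc = ET.fromstring(
    "<bibliography>"
    '<biblioid class="isbn">0-00-000000-0</biblioid>'
    '<biblioid class="other">x</biblioid>'
    '<biblioid class="doi" otherclass="y">z</biblioid>'
    "</bibliography>"
)
print(biblioid_problems(doc))
```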

3.1.6. Improved HTML and CALS Table Support

In DocBook V5.0, HTML tables and CALS tables are independently specified. Where the DTD of DocBook V4.x allows for incoherent models, DocBook V5.0 forbids them.
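One way to picture an "incoherent" table is one that mixes the two vocabularies. The sketch below classifies a table by its direct children; the two marker sets are a simplification chosen for this example and do not reproduce the actual content models.

```python
import xml.etree.ElementTree as ET

# Simplified markers (assumption): children that only make sense in one model.
CALS_ONLY = {"tgroup"}
HTML_ONLY = {"caption", "col", "colgroup", "thead", "tfoot", "tbody", "tr"}

def table_model(table):
    """Classify a table element as 'cals', 'html', 'mixed', or 'unknown'."""
    children = {child.tag for child in table}
    cals = bool(children & CALS_ONLY)
    html = bool(children & HTML_ONLY)
    if cals and html:
        return "mixed"
    if cals:
        return "cals"
    if html:
        return "html"
    return "unknown"

cals = ET.fromstring(
    "<table><title>T</title><tgroup cols='1'><tbody>"
    "<row><entry>x</entry></row></tbody></tgroup></table>"
)
html = ET.fromstring("<table><caption>T</caption><tr><td>x</td></tr></table>")
mixed = ET.fromstring("<table><tgroup cols='1'/><tr><td>x</td></tr></table>")
print([table_model(t) for t in (cals, html, mixed)])
```

A table reported as mixed is the kind of instance that independently specified models are meant to rule out.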

3.1.7. Datatypes

DocBook V5.0 adds a few simple data types. For example, the cols attribute on tgroup must be a positive integer.

Some of these constraints, such as the requirement that elements like pubdate include a proper date-time type, may prove controversial.
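The effect of such datatypes can be approximated with two small checks. This is a sketch under assumptions: the positive-integer test follows the usual lexical form (optional plus sign, digits, value greater than zero), and the date test accepts only the YYYY-MM-DD form, which may be narrower or broader than whatever type the schema finally uses for pubdate.

```python
import re
from datetime import date

def is_positive_integer(value):
    """True for strings such as "3" or "+12"; "0", "-1" and "2.5" fail."""
    text = value.strip()
    return re.fullmatch(r"\+?[0-9]+", text) is not None and int(text) > 0

def is_iso_date(value):
    """True if value parses as an ISO 8601 calendar date such as 2004-07-09."""
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True

print(is_positive_integer("3"), is_positive_integer("0"))
print(is_iso_date("2004-07-09"), is_iso_date("July 2004"))
```

The second pair shows why the pubdate constraint may be controversial: a free-form date such as "July 2004", common in existing documents, no longer passes.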

3.1.8. Universal Linking

Starting with DocBook V5.0, the linkend and href attributes are available on almost all elements.

The linkend attribute provides an ID/IDREF link within the document. The href attribute provides a URI-based link.

The link and ulink elements have been removed from DocBook as these linking constructs can now be achieved directly from the appropriate inline (such as productname or command). For instances where no specific semantic inline is needed, use phrase.
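For existing documents, the change amounts to moving link information from a dedicated element onto an inline. The following sketch rewrites V4.x link and ulink elements as phrase elements carrying linkend or href. It is an illustration of the mapping described above, not the conversion stylesheet; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def convert_links(root):
    """Rewrite ulink/link as phrase with href/linkend, in place."""
    for el in root.iter():
        if el.tag == "ulink":
            el.tag = "phrase"
            url = el.attrib.pop("url", None)  # V4.x ulink stores its target in url
            if url is not None:
                el.set("href", url)
        elif el.tag == "link":
            el.tag = "phrase"  # linkend is kept unchanged
    return root

para = ET.fromstring(
    '<para>See <ulink url="http://docbook.org/">the site</ulink> or '
    '<link linkend="intro">the intro</link>.</para>'
)
convert_links(para)
print(ET.tostring(para, encoding="unicode"))
```

Where the link text is already wrapped in a semantic inline such as productname, the attribute would go on that inline instead of a new phrase.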

3.1.9. Extra-Grammatical Constraints

Grammar-based validation technologies (like RELAX NG) and rule-based validation technologies (like Schematron) are naturally complementary. Mixing them allows us to play to the strengths of each without stretching either to enforce constraints that they aren’t readily designed to enforce.

For example, DocBook V5.0 requires that the root element of a document have an explicit version attribute. Because there are a great many elements that can be root elements in DocBook, and because they can almost all appear as descendants of a root element as well, it would be tedious to express this constraint in RELAX NG. But it would be easy in a rule-based schema language.

DocBook V5.0 uses Schematron where appropriate.

3.1.10. Customization

TBD. RELAX NG patterns enable easy customization.

3.1.11. Conversion

TBD. There’s an XSLT 1.0 stylesheet for performing conversion from DocBook V4.x to DocBook V5.0.
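Until this section is written, the flavor of the conversion can be suggested with two of the changes described earlier: collapsing the V4.x info element names into info and adding a version attribute to the document element. The sketch below is not the XSLT stylesheet and covers nothing else; the name list is the one given in section 3.1.2, and the value "5.0" for the attribute is an assumption.

```python
import xml.etree.ElementTree as ET

# V4.x names listed in section 3.1.2; real documents use more of them.
V4_INFO = {"setinfo", "bookinfo", "chapterinfo", "appendixinfo", "sectioninfo"}

def upgrade(root, version="5.0"):
    """Rename V4.x *info elements to info and stamp the root with a version."""
    for el in root.iter():
        if el.tag in V4_INFO:
            el.tag = "info"
    root.set("version", version)  # assumed attribute value
    return root

book = ET.fromstring(
    "<book><bookinfo><title>T</title></bookinfo>"
    "<chapter><chapterinfo><title>C</title></chapterinfo></chapter></book>"
)
upgrade(book)
print(ET.tostring(book, encoding="unicode"))
```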

4. Release Notes

TBD.

A. The DocBook Media Type

This appendix registers a new MIME media type, "application/docbook+xml".

1. Registration of MIME media type application/docbook+xml

MIME media type name:

application

MIME subtype name:

docbook+xml

Required parameters:

None.

Optional parameters:
charset

This parameter has identical semantics to the charset parameter of the application/xml media type as specified in [RFC 3023] or its successors.

Encoding considerations:

By virtue of DocBook XML content being XML, it has the same considerations when sent as "application/docbook+xml" as does XML. See [RFC 3023], Section 3.2.

Security considerations:

Several DocBook elements may refer to arbitrary URIs. In such cases, the security issues of RFC 2396, section 7, should be considered.

Interoperability considerations:

None.

Published specification:

This media type registration is for DocBook documents as described by [DocBook: TDG].

Applications which use this media type:

There is no experimental, vendor specific, or personal tree predecessor to "application/docbook+xml", reflecting the fact that no applications currently recognize it. This new type is being registered in order to allow for the deployment of DocBook on the World Wide Web, as a first class XML application.

Additional information:
Magic number(s):

There is no single initial octet sequence that is always present in DocBook documents.

File extension(s):

DocBook documents are most often identified with the extension ".xml".

Macintosh File Type Code(s):

TEXT

Person & email address to contact for further information:

Norman Walsh, .

Intended usage:

COMMON

Author/Change controller:

The DocBook specification is a work product of the DocBook Technical Committee at OASIS.

2. Fragment Identifiers

For documents labeled as "application/docbook+xml", the fragment identifier notation is exactly that for "application/xml", as specified in [RFC 3023] or its successors.
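A receiving application could recognize the media type and its optional charset parameter roughly as follows. This is a minimal sketch using the Python standard library; the helper name is hypothetical and the header value is an example, not something mandated by the registration.

```python
from email.message import EmailMessage

DOCBOOK_TYPE = "application/docbook+xml"

def parse_content_type(header_value):
    """Return (media_type, charset) from a Content-Type header value."""
    msg = EmailMessage()
    msg["Content-Type"] = header_value
    # charset is None when the optional parameter is absent.
    return msg.get_content_type(), msg.get_content_charset()

media_type, charset = parse_content_type("application/docbook+xml; charset=utf-8")
print(media_type == DOCBOOK_TYPE, charset)
```

When charset is absent, the rules for application/xml in [RFC 3023] determine the encoding, so the caller should fall back to XML's own detection rather than assume a default here.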

B. OASIS DocBook Technical Committee (Non-Normative)

The following individuals were members of the committee during the formulation of this Working Draft:

  • Steve Cogorno

  • Adam Di Carlo

  • Paul Grosso

  • Dick Hamilton

  • Nancy Harrison

  • Scott Hudson

  • Mark Johnson

  • Jirka Kosek

  • Larry Rowland

  • Michael Smith

  • Robert Stayton, Secretary

  • Norman Walsh, Chair, Editor

C. Notices

Copyright © The Organization for the Advancement of Structured Information Standards [OASIS] 2001, 2002, 2003, 2004. All Rights Reserved.

OASIS takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on OASIS's procedures with respect to rights in OASIS specifications can be found at the OASIS website. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification, can be obtained from the OASIS Executive Director.

OASIS invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to implement this specification. Please address the information to the OASIS Executive Director.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to OASIS, except as needed for the purpose of developing OASIS specifications, in which case the procedures for copyrights defined in the OASIS Intellectual Property Rights document must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

OASIS has been notified of intellectual property rights claimed in regard to some or all of the contents of this specification. For more information consult the online list of claimed rights.

D. Intellectual Property Rights

For information on whether any patents have been disclosed that may be essential to implementing this specification, and any offers of patent licensing terms, please refer to the Intellectual Property Rights section of the DocBook web page (http://www.oasis-open.org/committees/docbook/).

E. Revision History

Revision Working Draft “Beta 1” 09 Jun 2004

References

Normative

[RELAX NG] James Clark, editor. RELAX NG Specification (Committee Specification). OASIS. 2001.

[XML] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, and Eve Maler, editors. Extensible Markup Language (XML) 1.0 Second Edition. World Wide Web Consortium, 2000.

[RFC 2119] IETF (Internet Engineering Task Force). RFC 2119: Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. 1997.

[RFC 3023] IETF (Internet Engineering Task Force). RFC 3023: XML Media Types. M. Murata, S. St. Laurent, D. Kohn. 2001.

[DocBook: TDG] Norman Walsh and Leonard Muellner. DocBook: The Definitive Guide. O’Reilly & Associates, 1999.

Non-Normative

[SGML] JTC 1, SC 34. ISO 8879:1986 Information processing -- Text and office systems -- Standard Generalized Markup Language (SGML). 1986.

[W3C XML Schema] Henry S. Thompson, David Beech, Murray Maloney, et al., editors. XML Schema Part 1: Structures. World Wide Web Consortium, 2000.

[W3C XML Datatypes] Paul V. Biron and Ashok Malhotra, editors. XML Schema Part 2: Datatypes. World Wide Web Consortium, 2000.



                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com>      | There is always some accident in
http://www.oasis-open.org/docbook/ | the best of things, whether
Chair, DocBook Technical Committee | thoughts or expressions or deeds.
                                   | The memorable thought, the happy
                                   | expression, the admirable deed are
                                   | only partly yours.--Henry David
                                   | Thoreau
