docbook-tc message

Subject: Re: [docbook-tc] citation proposal

Hi all,

If you're looking to add support for citations as typically found in
scholarly works, you might want to look at what was done in the NLM
archiving DTD (see http://dtd.nlm.nih.gov/). Developing this DTD involved
extensive document analysis as well as analysis of more than 25 STM
publisher DTDs.

In particular, the markup of references and citations to them from the body
required a fair bit of flexibility because almost every publisher has
unique requirements in this area based on their specific production and
archive requirements. For example, the NLM DTD carefully sets the citation
element to IDREFS, not IDREF, because some instances of name/date style
citations are difficult to code in XML and then render appropriately with
only IDREF. Also, whether or not to keep PCDATA between elements for
punctuation and other "small" words in the reference section is a topic of
much debate, and so the model was set up with parameter entities to allow
redefinition as a non-PCDATA model.
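
To illustrate the IDREFS point, here is a minimal Python sketch of how a
single citation that carries several IDs might be resolved and rendered as
one name/date unit. The element and attribute names (cite, rid, ref, name,
year) are hypothetical and are not taken from the NLM DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are illustrative only,
# not taken from the NLM DTD.
DOC = """
<article>
  <p>Mainframes replaced slide rules <cite rid="miller1999 doe2000"/>.</p>
  <ref id="miller1999"><name>Miller</name><year>1999</year></ref>
  <ref id="doe2000"><name>Doe</name><year>2000</year></ref>
</article>
"""

def resolve_citation(root, cite):
    """Return the <ref> elements named by a whitespace-separated IDREFS value."""
    refs_by_id = {ref.get("id"): ref for ref in root.iter("ref")}
    # An IDREFS value is a list of IDs; an IDREF value could hold only one.
    return [refs_by_id[rid] for rid in cite.get("rid", "").split()]

def render_name_date(refs):
    """Render all resolved references as a single name/date citation."""
    parts = [f"{r.findtext('name')} {r.findtext('year')}" for r in refs]
    return "(" + "; ".join(parts) + ")"

root = ET.fromstring(DOC)
cite = root.find(".//cite")
rendered = render_name_date(resolve_citation(root, cite))
print(rendered)
```

With IDREF only, the same citation would need two separate elements, and
the stylesheet would have to work out that they belong inside one pair of
parentheses.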

Finally, not to be a heretic, but to point out useful options (don't get me
wrong here because I really like DocBook for many applications), if you are
working with scholarly content, you may want to look into the NLM DTD. It
has been explicitly developed for scholarly publishing, independent of the
needs of any specific publisher or discipline. In addition, it's a modular
DTD that's been explicitly designed for publishers to customize it as
necessary for their specific needs. The DTD is in the public domain, so you
can freely take it and modify it. Finally, most scholarly publishers are
concerned about long-term archiving issues, and as this DTD was funded in
part by the Mellon Foundation for the needs of libraries and other
archiving institutions, it is likely to be the DTD of choice for scholarly
publishers when archiving content.

Best regards,


At 05:15 PM 10/20/2003, Bob Stayton wrote:
>This proposal is for RFE 810932 "Improved citation support".
>I'm not the author; I'm channeling this proposal for three DocBook
>users Bruce D'Arcus, Peter Flynn, and Markus Hoenicka. If Norm
>finds room for this item on the agenda, I wonder if it would be
>appropriate to invite one of them to sit in on that part of the
>call to answer questions.
>Bob Stayton
>Sagehill Enterprises
>----- Forwarded message from Bruce D'Arcus <bdarcus@fastmail.fm> -----
>Date: Sun, 19 Oct 2003 22:46:49 -0400
>Subject: citation proposal
>From: Bruce D'Arcus <bdarcus@fastmail.fm>
>To: Bob Stayton <bobs@sco.com>
>OK Bob,
>After wrangling over this for a while, here is the proposal the three of
>us have settled on. A group of us have been discussing improved
>bibliographic and citation support in XML for about the past year, though
>much of that has been focused on metadata models and XSLT-based
>formatting. Still, these basic DTD changes are essential to making DocBook
>suitable for scholarly work, which I once read Terry Allen had considered
>as a goal.
>Let me know how things proceed, and if anyone has any comments and/or  
>questions.  Best place to go for those interested is the bibliofile  
>page, where there is a link for the mailing list:
>Improved citation support in DocBook
>Citation support in DocBook is weak. In order to improve it,
>some small code changes are needed.
>1. Terminology
>A citation describes the intellectual origin of a stretch of text,
>regardless of whether this is a literal quote, an edited excerpt, or a
>statement based on the content of sources other than the author's
>personal work. It can also be used to refer the reader to additional
>information that goes beyond the scope of the document. A citation can
>contain bibliographic references, although it is more common to use
>pointers to bibliographic references. As the knowledge compiled in a
>statement may be drawn from several sources, it is sometimes necessary
>to use two or more references or pointers to references in a
>citation. In addition, a citation may contain other explanatory text.
>A bibliographic reference is a "self-sufficient description of a
>bibliographic item" (as the TEI guidelines define it), and as such
>usually sufficient to locate a printed or electronic copy of the
>referenced work. It is common to collect bibliographic references in a
>list at the end of a document or of a chapter.
>A pointer to a bibliographic reference is a cross-reference that links
>citations to bibliographic references, thus eliminating the need to
>provide the bulk of the bibliographic information in the text
>flow. The pointer is usually rendered using a citation key, the number
>of the bibliographic reference in the reference list, or an
>author/year representation of that reference.
>The following graph outlines the relationship of these three items
>(use a fixed font for display if it doesn't seem to make sense):
>Mainframe computers have gained widespread acceptance as a replacement
>for slide rules (Miller 1999; Doe 2000).
>                 ^---------^
>                 pointer to reference 1
>                              ^------^
>                              pointer to reference 2
>                ^---------------------^
>                citation
>Miller,A: A survey of the applications of mainframe   < reference 1
>           computers. Adv.Sci.Comp. 13:497, 1999.    
>Doe, B: Mainframes and numeric mathematics. Am.J.Eng. < reference 2
>         54:87, 2000.                                  
>DocBook contains sufficient support to encode bibliographic references
>(<bibliography> and related elements). However, the support for
>pointers to bibliographic references should be extended to make
>DocBook more versatile. The changes are proposed 1) to
>make the formatting of citations and bibliographic references
>according to a publisher-supplied style specification feasible and 2)
>to allow DocBook to be used for documents that have more demanding
>requirements for citations.
>2. New attribute "renderas" for the <citation> element
>Citations may be used in different ways by an author. This may
>influence the processing expectations of <citation> elements. The
><citation> element should be extended with an attribute that allows an
>author to select a specific processing expectation.
>1) Citation outside of the text flow
>This is the most common case. The citation is to be rendered outside
>the text flow, for example in brackets or as a superscript (this is at
>the discretion of the stylesheet or of a processing application):
>Computers require an operating system (Miller et al., 1999).
>Computers require an operating system [1].
>2) Citation in the text flow
>Sometimes it is required to integrate parts of the bibliographic
>reference into the text flow. These parts must still retain their
>function as a pointer to a bibliographic reference:
>Miller et al. (1999) analyzed 250 common computer models and concluded that
>all of them required an operating system.
>Miller et al. [1] analyzed 250 common computer models and concluded that
>all of them required an operating system.
>In this case, both "Miller et al." and "(1999)" or "[1]",
>respectively, are citations with one pointer to a bibliographic
>reference each. However, their integration into the text flow requires
>that each be rendered differently from the other and from case 1).
><citation renderas="full"><biblioref linkend="Miller1999"/></citation>
><citation renderas="author"><biblioref linkend="Miller1999"/></citation>
><citation renderas="year"><biblioref linkend="Miller1999"/></citation>
>Code required:
>Addition of renderas to the ATTLIST of <citation> as NMTOKEN #IMPLIED
>Level: essential
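
As a rough illustration of the processing expectation described in section
2, the following Python sketch renders a <citation> according to its
proposed renderas attribute. The bibliography lookup table, the output
punctuation, and the choice to treat a missing attribute as "full" are all
assumptions made for the example; the proposal leaves rendering to the
stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliography data; a real stylesheet would read <bibliography>.
BIBLIOGRAPHY = {"Miller1999": {"author": "Miller et al.", "year": "1999"}}

def render_citation(xml_text):
    """Render a <citation> according to its proposed renderas attribute."""
    citation = ET.fromstring(xml_text)
    # Treating a missing renderas as "full" is an assumption of this sketch.
    mode = citation.get("renderas", "full")
    entries = [BIBLIOGRAPHY[b.get("linkend")] for b in citation.iter("biblioref")]
    if mode == "author":
        # In-text use: only the author part, no brackets.
        return "; ".join(e["author"] for e in entries)
    if mode == "year":
        # In-text use: only the year, in parentheses.
        return "(" + "; ".join(e["year"] for e in entries) + ")"
    # "full": the complete author/year form, outside the text flow.
    return "(" + "; ".join(f"{e['author']}, {e['year']}" for e in entries) + ")"

full = render_citation('<citation renderas="full"><biblioref linkend="Miller1999"/></citation>')
author = render_citation('<citation renderas="author"><biblioref linkend="Miller1999"/></citation>')
year = render_citation('<citation renderas="year"><biblioref linkend="Miller1999"/></citation>')
print(full, "|", author, year)
```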
>3. Addition of new <biblioref> and <bibliospec> elements
>While it is possible to use the existing <xref> element in a
><citation> to encode pointers to entries in a <bibliography> (please
>note the striking identity in the semantics of a pointer and <xref>),
>the <xref> element is not suitable to carry additional bibliographic
>information that applies only to the current citation. For example, if
>the bibliographic reference describes a book, a citation may
>specifically refer to a chapter or to a range of pages in that book.
>Think of the proposed <biblioref> as an extension of <xref> that
>allows children, namely <bibliospec>, to specify additional
>bibliographic information. Applications are expected to process this
>element in a way that uses both the information provided in the
>bibliographic reference pointed to (e.g. a citation key, the number of
>the entry in the bibliography, or an author/year representation of the
>reference) and the additional information provided in the children. If
>a <citation> contains more than one <biblioref>, processing
>applications are expected to render them as a unit. For example,
>pointers to consecutive entries in a numbered bibliography may be
>rendered as "[1-3]".
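
The "render them as a unit" expectation for a numbered bibliography can be
sketched in Python as follows. The function name collapse_numbers and the
rule that only runs of three or more consecutive entries are collapsed are
assumptions for illustration; a publisher's style specification would
decide the actual rule.

```python
def collapse_numbers(numbers):
    """Render bibliography entry numbers as one bracketed unit."""
    runs = []
    for n in sorted(set(numbers)):
        if runs and n == runs[-1][1] + 1:
            runs[-1][1] = n          # extend the current consecutive run
        else:
            runs.append([n, n])      # start a new run
    parts = []
    for start, end in runs:
        if end - start >= 2:
            parts.append(f"{start}-{end}")           # three or more: collapse
        elif end > start:
            parts.extend([str(start), str(end)])     # exactly two: list both
        else:
            parts.append(str(start))
    return "[" + ", ".join(parts) + "]"

print(collapse_numbers([1, 2, 3]))
print(collapse_numbers([1, 2, 3, 5]))
```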
>The <bibliospec> element is preferable to allowing #PCDATA in
><biblioref> because the formatting of the provided information should
>be left to stylesheets. For example, a range of pages may be
>rendered as "pages 12 through 15", "pp 12-15", or maybe as "pp 12 sq".
><citation><biblioref linkend="Miller1999"><bibliospec unit="stanza"
>start="2" /><bibliospec unit="line" start="3" stop="4"
>/></biblioref></citation>
>Code required:
>Addition of elements with the following content models and attributes:
><!ELEMENT biblioref (bibliospec*)>
><!ATTLIST biblioref linkend IDREF #IMPLIED
>                     endterm IDREF #IMPLIED>
><!ELEMENT bibliospec EMPTY>
><!ATTLIST bibliospec unit NMTOKEN #REQUIRED
>                      start NMTOKEN #REQUIRED
>                      stop NMTOKEN #IMPLIED>
>Inclusion of <biblioref> into the content model of <citation>
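
To show why keeping the locator in <bibliospec> attributes leaves the
formatting to the stylesheet, here is a small Python sketch that reads the
proposed elements and formats the same data in two ways. The style names
"short" and "verbose" and the output wording are hypothetical; only the
element and attribute names come from the proposal.

```python
import xml.etree.ElementTree as ET

EXAMPLE = (
    '<citation><biblioref linkend="Miller1999">'
    '<bibliospec unit="stanza" start="2"/>'
    '<bibliospec unit="line" start="3" stop="4"/>'
    '</biblioref></citation>'
)

def format_locator(spec, style):
    """Format one <bibliospec>; the two styles are hypothetical examples."""
    unit, start, stop = spec.get("unit"), spec.get("start"), spec.get("stop")
    if stop is None:                      # stop is #IMPLIED, so it may be absent
        return f"{unit} {start}"
    if style == "verbose":
        return f"{unit}s {start} through {stop}"
    return f"{unit}s {start}-{stop}"

def locators(citation_xml, style="short"):
    """Map each linkend in a <citation> to its formatted locators."""
    citation = ET.fromstring(citation_xml)
    return {
        ref.get("linkend"): [format_locator(s, style) for s in ref.findall("bibliospec")]
        for ref in citation.iter("biblioref")
    }

short = locators(EXAMPLE)
verbose = locators(EXAMPLE, style="verbose")
print(short)
print(verbose)
```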
>If the inclusion of two elements seems excessive, we could consider
>using a simplified <biblioref> element:
><!ELEMENT biblioref (bibliospec*)>
><!ATTLIST biblioref linkend IDREF #IMPLIED
>                     endterm IDREF #IMPLIED
>                     unit NMTOKEN #REQUIRED
>                     start NMTOKEN #REQUIRED
>                     stop NMTOKEN #IMPLIED>
>This restricts the author to using a single level of bibliographic
>information, like a page range or a chapter range, but not both at the
>same time. This may still be sufficient for most purposes.
>Level: essential
>4. Specification of navigational information in citations
>Add free-text caption or instructional text to citations to
>direct the reader.
>Example: <citation refs="Smith99" caption="left figure">...
>Code required: add an attribute "caption CDATA #IMPLIED"
>                     to citation.
>Alternative: add caption element type to the content model
>                  <!ELEMENT citation %ho; (%para.char.mix;|caption)*>
>Level: important.
>5. Add <biblioref> to the content model of element types implying  
>Add <biblioref> to the content model of <quote>, <blockquote> and <epigraph>.
><quote>A quote <biblioref linkend="Smith1999"><bibliospec
>unit="page" start="22" stop="23" /></biblioref></quote>
>Code required:
>Extend the content models of <blockquote> and <epigraph> to allow
><biblioref> elements.
>Level: important
>----- End forwarded message -----

Bruce D. Rosenblum
Inera Inc.
815 Washington St. #3
Newton, MA 02460
617-969-3053 (office)
617-969-4911 (fax)
