Subject: RE: [docbook] Future DocBook Ruminations - Modular Source Files & XInclude

From: "Wills, Robert" <Robert.Wills@sts.co.uk>
To: Michael Smith <smith@xml-doc.org>, docbook@lists.oasis-open.org
Date: Wed, 20 Aug 2003 17:20:44 +0100

> You haven't really described what problems your proposal is meant to
> solve -- what the intended goal is.

Good point. Let's start at the beginning.

We have a team of developers contributing to some reasonably large manuals
(100 pages or so apiece). The DocBook XML source files are stored in CVS and
often different people will be editing different bits (files) of a document
at the same time. Some bits (files) are included in more than one master
document. Documents are fully validated, converted to PDF and staged on an
internal website only occasionally. I am in complete control of the
tool-chain used for conversion, but not the choice of editing programs used
to work on the source files.

Each source file (most are a <section>) declares a DOCTYPE with a local DTD,
and all the editors in use by the group which support validation can happily
read this DTD and report any grammar constraint errors. When we pull the
document together my tool-chain validates it again (e.g. to check IDs are
unique across the whole doc) and converts the resulting infoset to PDF. On
the whole, everyone's happy.
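
To illustrate why the second, whole-document pass is still needed: ID
uniqueness can only be judged once the fragments are combined. A minimal
Python sketch of such a check follows. It is not the actual tool-chain, the
function name duplicate_ids is made up for illustration, and a validating
parser would normally report the same problem itself.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def duplicate_ids(root):
    """Return the id values that occur more than once in the tree, sorted."""
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)

# Hypothetical combined document: each section is fine on its own,
# but two of them reuse the same id once they sit in one tree.
combined = ET.fromstring(
    '<article>'
    '<section id="intro"><title>Intro</title></section>'
    '<section id="intro"><title>Install</title></section>'
    '<section id="config"><title>Config</title></section>'
    '</article>'
)
clashes = duplicate_ids(combined)
```

Run against each fragment separately, the same function would find nothing,
which is why the clash only shows up after the document is pulled together.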

The snag is that to achieve this I cheated slightly by using a processing
instruction to do the includes - the "master" XML file for one of our docs
currently looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V4.1.2.5//EN"
"../../../schemas/sdocbook/sdocbook.dtd">
<article>
 <articleinfo>
  <title>STS Worksuite 3.4 Application Server Platform Install Guide</title>
 </articleinfo>
 <?oodb-include href="intro.xml"?>
 <?oodb-include href="install_deps_win.xml"?>
 <?oodb-include href="install_worksuite_win.xml"?>
 <?oodb-include href="config_weblogic.xml"?>
 <!-- snip -->
 <para/> <!-- dummy para to satisfy grammar of article -->
</article>
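
For readers wondering what expanding the <?oodb-include?> lines involves,
here is a minimal Python sketch of one way it could be done. It is an
illustration under assumptions, not the tool-chain described here:
expand_includes and its load callback are hypothetical names, and the href is
pulled out of the instruction's data with a naive split on double quotes.

```python
from xml.dom import minidom

def expand_includes(doc, load):
    """Replace each <?oodb-include href="..."?> with the included root element.

    `load` is a hypothetical callback mapping an href to XML text, so the
    sketch does not depend on the file system.
    """
    targets = []

    def walk(node):
        # Collect the instructions first so the tree is not modified mid-walk.
        for child in node.childNodes:
            if (child.nodeType == child.PROCESSING_INSTRUCTION_NODE
                    and child.target == "oodb-include"):
                targets.append(child)
            else:
                walk(child)

    walk(doc)
    for pi in targets:
        # The instruction's data is free text such as: href="intro.xml"
        href = pi.data.split('"')[1]
        # Expand the fragment's own includes before merging it in.
        fragment = expand_includes(minidom.parseString(load(href)), load)
        imported = doc.importNode(fragment.documentElement, True)
        pi.parentNode.replaceChild(imported, pi)
    return doc

# Stand-in for the real source files.
files = {"intro.xml": "<section><title>Intro</title><para>Hi</para></section>"}
master = minidom.parseString(
    '<article><title>Guide</title><?oodb-include href="intro.xml"?></article>'
)
merged = expand_includes(master, files.__getitem__)
```

Because a processing instruction is invisible to DTD validation, the master
file and every fragment stay valid on their own; only a tool that knows the
oodb-include target does anything with it.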

At the time I set this up I was familiar with XML but relatively new to
DocBook and under time pressure, so I hacked first and asked questions
later.

Which is where I am now, asking questions about the "correct" way to do
this, before our document repository gets too big. What I don't want to lose
is the ability to validate these XML document fragment files independent of
the document (or documents) which include them, since this is where 90% of
the edit-validate-fix cycle happens in our rather distributed environment.
Many people use syntax-directed editors which 'understand' XML in a generic
way but are not specifically DocBook-aware, which seems to work really well.

So if I replace my hack ...

 <?oodb-include href="intro.xml"?>

... with ...

 <include xmlns="http://www.w3.org/2001/XInclude" href="intro.xml"/>

... which seems the "correct" approach, the validating editors I have tried
immediately complain about the unrecognised <include> element. I've had a
quick look at the XInclude W3C spec. - the relevant passage seems to be:

"1.3 Relationship to DTDs
XInclude defines no relationship to DTD validation. XInclude describes an
infoset-to-infoset transformation and not a change in XML 1.0 parsing
behavior. XInclude does not define a mechanism for DTD validation of the
resulting infoset."

I read that as implying that these editors are doing the right thing by
complaining.
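
The infoset-to-infoset transformation the spec describes can be sketched with
the XInclude support in Python's standard library (xml.etree.ElementInclude).
This is only an illustration under assumptions: the FRAGMENTS dictionary and
the loader function stand in for real files, and no DTD validation happens
here at all.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical fragment store standing in for files on disk.
FRAGMENTS = {
    "intro.xml": "<section><title>Introduction</title>"
                 "<para>Overview text.</para></section>",
}

def loader(href, parse, encoding=None):
    # ElementInclude calls the loader with parse set to "xml" or "text".
    text = FRAGMENTS[href]
    return ET.fromstring(text) if parse == "xml" else text

master = ET.fromstring(
    '<article>'
    '<title>Install Guide</title>'
    '<include xmlns="http://www.w3.org/2001/XInclude" href="intro.xml"/>'
    '</article>'
)

# Replaces each XInclude element in place with the parsed fragment.
ElementInclude.include(master, loader=loader)
merged = ET.tostring(master, encoding="unicode")
```

Until a step like this has run, a validator only ever sees the <include>
element itself, so a DTD that does not declare it will reject the file.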

So I guess what I'm asking is for a way to keep some level of validation
really early in the food chain, where lots of small XML files holding the
source for only part of a document are being edited "in the wild" and are
not yet in a DocBook-specific editing/processing environment.

I thought about hacking my own variant DTD that allows <xi:include> elements
in these <section> files, while retaining the proper DTD in the master
document for full validation when the infosets are combined, but this didn't
seem exactly elegant either.

Am I missing something obvious here, or does everyone else with big
documents just take a "big-bang" approach to grammar validation by only
doing it when processing the whole document? 

Thanks,
Rob.
