We've been working on converting our multiple-book, multiple-set documentation system from DocBook SGML 4.1 to DocBook XML 5.0. So far, with some help from this list (Bob Stayton in particular), things have been going well. Now a question has come up on which we would again like your advice.
We begin the conversion (SGML 4.1 to XML 4.5) in our current document processing system on Linux, using osx, and then complete it (XML 4.5 to XML 5.0) in oXygen on Windows, which is our new system for maintaining the XML source documents and producing output.
In SGML, we made extensive use of text entities to create replaceable text strings, as well as SYSTEM entities to organize and replicate larger sections of text such as chapters and sections of books. Taking advantage of SGML's flexibility, we were able to insert identical blocks of text into different places in a single book, or in different books of a set.
For example, with this markup for an appendix in, say, Book 1:
and this entity declaration:
<!entity app-gpl SYSTEM "app-gpl.sgml">
we were able to include the text of the GNU General Public License in a "GPL" appendix. The entity &app-gpl; points to a file in a common directory that contains all of the text of the appendix except for the <appendix></appendix> markup. When the entity is resolved, the correct, marked-up content of the appendix is included in the output.
Using the ID in the <appendix> tag, we were able to create links within Book 1 to the "GPL" appendix. For Book 2, we would use an ID of "book2-ax-gpl" for that tag, and so on for each book. In this way we could create a <set> of books in which the links of each book would go to its own copy of the "GPL" appendix.
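For concreteness, the pattern across two books can be sketched roughly as follows. The Book 1 ID "book1-ax-gpl" is assumed here by analogy with the Book 2 ID, and the xref is only an illustration of a link. The fragment is shown in DocBook 4.x XML syntax; in SGML the xref tag would have no trailing slash.

```xml
<!-- In Book 1 (the ID "book1-ax-gpl" is assumed by analogy with Book 2) -->
<appendix id="book1-ax-gpl">
&app-gpl;
</appendix>

<!-- In Book 2: same shared content, different ID on the wrapper -->
<appendix id="book2-ax-gpl">
&app-gpl;
</appendix>

<!-- An illustrative link inside Book 1 that resolves to Book 1's own copy -->
<xref linkend="book1-ax-gpl"/>
```

The point is that the ID lives on the wrapper element in each book, not in the shared file, so the same content can appear under a different ID in every book.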
As I understand it, this was considered acceptable practice in SGML. However, my reading of Bob Stayton's book and the oXygen documentation has led me to the conclusion that in DocBook XML this is not possible, at least not using XInclude. With XInclude, each section of text in a given document must have a unique ID, and each included file must be completely valid in itself, which means it has to carry its own <appendix> element and therefore its own ID. We need to replicate identical chunks of text in books within a bookset, yet give each copy a unique ID. Because our books were designed from the beginning with this capability in mind, we use it quite a bit, and would like to continue using this approach with DocBook XML 5.0.
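To illustrate the problem as I understand it, here is a minimal sketch. The file name common/app-gpl.xml and the ID "ax-gpl" are made up for this example, and the license text is abbreviated.

```xml
<!-- Fragment 1: common/app-gpl.xml (hypothetical file name).
     To be valid on its own, the file needs the appendix element,
     so the ID ends up fixed inside the shared file. -->
<appendix xmlns="http://docbook.org/ns/docbook" version="5.0"
          xml:id="ax-gpl">
  <title>GNU General Public License</title>
  <para>...</para>
</appendix>

<!-- Fragment 2: the same line placed in Book 1 and again in Book 2.
     Both copies arrive with xml:id="ax-gpl", so a set containing
     both books has two elements with the same ID. -->
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
            href="common/app-gpl.xml"/>
```

With a plain XInclude like this, I see no place to put a per-book ID such as "book2-ax-gpl", because the element that carries the ID is inside the shared file.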
Faced with this challenge, we have tried taking the same approach in XML as we were using in SGML, but instead of using SYSTEM entities, we use text entities. That is, we take the contents of the file app-gpl.sgml and copy it into the quoted replacement string of the text entity declaration.
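A minimal sketch of what this looks like, assuming the declaration sits in the internal subset of a single book (in our setup the declarations are collected in a separate entity declaration file). The title, the abbreviated license text, and the ID "book1-ax-gpl" are placeholders for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book [
  <!-- Text entity: the appendix body is stored in the declaration itself.
       The value is quoted with ', so any apostrophe in the real text
       would have to be written as &#39; -->
  <!ENTITY app-gpl '
    <title>GNU General Public License</title>
    <para>...</para>
  '>
]>
<book xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Book 1</title>
  <!-- The wrapper and its ID stay in the book, as they did in SGML -->
  <appendix xml:id="book1-ax-gpl">
    &app-gpl;
  </appendix>
</book>
```

Book 2 would declare the same entity and wrap the reference in an appendix with xml:id="book2-ax-gpl".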
The good news is that the document validates and we get correct output. However, on the back end it means that instead of neat rows of pointers to filenames, our entity declaration file is quite large, as it now contains all of the text and markup content of all of the common files. So, the question we put to you is: Is this an acceptable approach (or the only approach) to the problem, or is there a better/recommended alternative?
Thank you in advance for your help.
Bob McIlvride Communications Manager
Cogent Real-Time Systems Inc., a Skkynet company
T 1 905 702 7851 ext 103