Subject: Initial Experience With Current DITA Schemas
Executive Summary:

1. I feel strongly that we must define one or more (probably just one) namespace for DITA in the 1.0 release (we may have already decided this).

2. The current DITA schemas must be reworked to declare this namespace as their target namespace.

-----------------------------------

How I Arrived At These Conclusions:

I have started a little research project, XIRUSS-T (xiruss-t.sourceforge.net), with the aim of both demonstrating various principles and techniques for versioned content management of compound documents that use XInclude (or its moral equivalent) for re-use and providing a sandbox for experimentation. Part of the focus of the system is techniques for doing compound document import, as that is where much of the complexity of compound document management lies.

A fundamental design feature of the XIRUSS system is that it only supports the use of schemas, not DTDs: if you import a document that uses a DTD, it does nothing with the DTD reference and will not import any external DTD subset or parameter entities into the repository. However, if the document uses a schema, it will also import the schema (if not already in the repository) and will maintain as object metadata the dependency between the document and its governing schemas, as well as a mapping from namespaces to schemas.

In addition, the import process uses the document's namespaces to determine what, if any, schema-specific import processing to apply to the document. For example, my code currently has an XSLT importer that recognizes XSLT documents and applies XSLT-specific processing to them in order to import all the member documents of a multi-document XSLT transform, as well as base-level support for XInclude include references.

I was trying to implement an importer for DITA documents. Computationally the problem is simple: just find all the topicrefs, elements with conref= attributes, and so on, and chase them down.
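
To make the "find the references and chase them down" step concrete, here is a minimal Python sketch. It is an illustration only, not XIRUSS code: the function names and the `load` callable (which returns the XML text for a URI) are hypothetical, and matching topicrefs by element name alone is a simplification that would miss specialized element types.

```python
import posixpath
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def find_references(xml_text):
    """Return the document parts of topicref/@href and @conref values."""
    refs = set()
    for elem in ET.fromstring(xml_text).iter():
        values = [elem.get("conref")]
        if local_name(elem.tag) == "topicref":
            values.append(elem.get("href"))
        for value in values:
            if value:
                # Drop the fragment; same-document references leave "".
                doc = value.split("#", 1)[0]
                if doc:
                    refs.add(doc)
    return refs


def collect_compound_document(start_uri, load):
    """Chase references from start_uri; `load(uri)` is a hypothetical
    callable returning the XML text of the document at that URI."""
    seen = set()
    pending = [start_uri]
    while pending:
        uri = pending.pop()
        if uri in seen:
            continue  # already imported; also guards against cycles
        seen.add(uri)
        base_dir = posixpath.dirname(uri)
        for ref in find_references(load(uri)):
            # Resolve each reference relative to the referencing document.
            pending.append(posixpath.normpath(posixpath.join(base_dir, ref)))
    return seen
```

Called with a map's URI, `collect_compound_document` returns the set of all documents reachable from that map, directly or indirectly.
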
My importer framework provides a simple model for doing this processing, making it a matter of a few minutes to implement a new importer of this sort.

The problem I ran into was the way the current DITA schemas are defined. As provided in both the IBM distribution and the OASIS submissions, the DITA schemas do *not* have a target namespace. A document that uses the DITA schemas does not declare any namespace for DITA; it just uses the noNamespaceSchemaLocation= attribute to point to the schema file.

This exactly mirrors the way DTDs are used in XML and also demonstrates the reason that I chose *not* to support DTDs in XIRUSS: there is nothing about either the reference to the schema instance or the schema itself that enables a reliable mapping from an instance document to an abstract "document type" (that is, the set of business rules that govern a set of documents and their processing). This means that the current DITA schemas are just a set of syntactic constraints with no defined association to any abstract set of rules (such as the DITA specifications).

Doh!

This means that the XIRUSS system, in addition to not supporting DTDs, also can't support schemas with no target namespace. [I can support documents that have no global namespace as long as they use pure XInclude for doing use-by-reference, but I can't associate schema-specific processing with those documents.]

I hadn't properly appreciated this until now. Doh!

What this really comes down to is that in order for a document to be unambiguously associated with a set of business rules it *must* declare a root namespace.
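
A small Python sketch of why the root namespace matters for choosing an importer. This is an illustration, not XIRUSS code: the importer names are placeholders, and `urn:example:dita` is a made-up namespace name, since DITA does not define one at this point. Only the XSLT namespace name is real.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"
DITA_NS = "urn:example:dita"  # hypothetical; no DITA namespace is defined yet

# Hypothetical namespace-to-importer registry.
IMPORTERS = {
    XSLT_NS: "xslt-importer",
    DITA_NS: "dita-importer",
}


def root_namespace(xml_text):
    """Return the namespace name of the root element, or None."""
    tag = ET.fromstring(xml_text).tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def choose_importer(xml_text):
    """Pick schema-specific processing from the root namespace alone."""
    return IMPORTERS.get(root_namespace(xml_text), "generic-importer")
```

A topic that only carries noNamespaceSchemaLocation= has no root namespace, so the lookup has nothing to key on and falls through to the generic importer; the same topic in a declared namespace can be dispatched reliably.
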
Because the namespace spec explicitly says that one cannot presume that a namespace name relates to any particular schema (in the generic sense), to be completely clear the namespace must be associated with a schema, which can be done in any one of three ways (schemaLocation= in instances, targetNamespace in schema documents, or through an application-specific namespace-to-schema mapping). The schema then becomes the physical representative of the larger set of business rules and their definitions that makes up a complete document type. The namespace the schema governs then becomes the one true name for that document type.

This suggests to me that DITA must define at least one namespace and must associate its schemas with that namespace. Without this there is no way to unambiguously know that a given document is in fact a DITA document (or formally derived from the DITA architecture).

So I tried the experiment of putting all the various DITA schema files in a single namespace. This worked for the purposes of validating the documents (at least Turbo XML was happy; Stylus Studio 4.6 was not, but I suspect that this old version of Stylus is just ignorant). But it tripped over some shortcomings in my current XSD importer process. The import worked to the degree that I was able to import all the topics directly or indirectly referenced by a map, but the schema associations got a bit confused for reasons I won't bore you with.

However, I wasn't completely happy with this schema design: I suspect that it would actually be a more accurate reflection of the true abstract DITA architecture to have distinct namespaces for the different layers of types and then, if necessary, use schema-level derivation to map base names in one namespace to the same name in the namespace of a specialized schema.
I tried doing this with the current schemas and it didn't work at all, although I suspect that this was in part because Stylus, which I was using to edit the schemas, doesn't give me accurate validation feedback (reporting problems that aren't really problems). But it did start to feel like the current organization of the types just isn't right, in that it reflects the *syntactic* organization imposed by using parameter entities in DTDs and not the true specialization hierarchy of the abstract DITA architecture.

I think we need to think very carefully about how namespaces will be used in both DITA 1.0 and 1.+.

Cheers,

E.
--
W. Eliot Kimber
Professional Services
Innodata Isogen
9030 Research Blvd, #410
Austin, TX 78758
(512) 372-8122

eliot@innodata-isogen.com
www.innodata-isogen.com