OASIS Mailing List Archives: dita message

Subject: DTD Implementation VS the DITA Abstraction

It occurred to me that in my recent discussions I may have given the 
impression of undervaluing the details of how the DITA DTDs and schemas 
are actually constructed. I may also appear to be callous with respect 
to the needs of authors.

Neither is the case. It's just that we have to be careful to distinguish 
the abstractions that are being standardized from the implementation 
expressions of those abstractions. We also have to be careful to 
understand when those concerns apply: during standardization or during 
the implementation of standards-based systems.

The DITA DTDs and schemas are implementation expressions of the core 
DITA abstractions. This means that, in the abstract, there can be any 
number of such implementations that are functionally equivalent and 
equally useful. In particular, the DITA *standard* needs 
to explicitly define a world in which there can be different but 
equivalent implementations.

However, for the purposes of providing useful reference implementations 
to the DITA community the implementation is very important. In this 
area, the work that the IBM DITA team has already done is of vital 
importance--it reflects lots of hard work, careful thought, and hard 

By the same token, while most of my focus so far has been on 
*processing* as opposed to authoring, authoring is of course vitally 
important to creating a complete information management system that will 
be used and used effectively. From an author's standpoint the key system 
features are clear element type names and appropriate content models.

Thus, when defining concrete DITA *applications* (to use the terminology 
I introduced in the Namespace resolution thread) the issues of element 
type names and content models are of vital importance. But these are the 
concerns of *applications*, not of the core DITA standard. Of course, 
because the DITA standard is also defining abstract types and our 
expectation is that those types will be used directly as element type 
names, we can't literally be arbitrary in our choice of DITA type names.

Also, because the element type names used in DITA applications are 
entirely up to the application designer and are not constrained by the 
DITA specification in any way, application designers have a lot of 
flexibility to do what will be best for their authors. This again means 
that element type names or details such as whether or not applications 
use namespace qualification need not be a direct concern to the DITA 
specification itself. It will be a concern to the reference DITA 
application, but I think we've already established that the only 
reasonable thing we can do in the 1.0 timeframe is to do essentially 
what IBM DITA does and use no namespace for element types.

Finally, many XML authoring tools offer some form of "alias" for 
element type names, meaning that the user interface exposed to authors 
need not be directly constrained by the base element type names.
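
To make the alias idea concrete, here is a minimal sketch of how a tool 
might map base element type names to author-facing labels. The table 
contents and the function name display_name are hypothetical and are not 
taken from any particular authoring tool or from the DITA DTDs.

```python
# Hypothetical alias table: base element type name -> label shown to authors.
ALIASES = {
    "p": "Paragraph",
    "ul": "Bulleted List",
    "li": "List Item",
}


def display_name(element_type: str) -> str:
    """Return the author-facing label, falling back to the base name."""
    return ALIASES.get(element_type, element_type)
```

An element type with no alias is simply shown under its own name, so the 
underlying document markup never changes.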

Of course, as an engineering principle, we want to implement things so 
that the simplest systems will be as effective as they can reasonably 
be, meaning that the default element type names should be well thought 
out and clear, but we don't have to worry overmuch about the 
implications for authoring because real production systems will almost 
always involve a fairly large degree of customization anyway.

I think one thing that may be derailing these discussions is that the 
IBM DITA developers have, appropriately, been primarily focused on 
authoring because they were developing an authoring support system and 
the DTDs were (and are) a key part of that system.

But the DITA standard is *not* primarily an authoring support system. It 
is a generic standard that defines core types and processing semantics 
that in turn provide a solid basis from which task-specific authoring 
support systems can be built. That's a key difference and requires a 
sometimes subtle shift in emphasis of requirements and features.

I think it comes down to this:

DTDs and schemas are, primarily, system components that support 
authoring and are important primarily in the context of authoring 
support systems. Processing systems don't care at all about DTDs except 
to the degree that they need either markup minimization (default 
attribute values) or require that documents pass a validation gate. But 
even for validation, DTDs and schemas are either only part of the 
solution or can be completely replaced by validation applications (which 
you have to have if you must support schema-less documents). So 
ultimately you come to the conclusion that DTDs and schemas primarily 
support authoring and are at best a convenience for the rest of the 
system (and at worst an impediment because they have to be accounted for 
even when they aren't needed).
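
As a rough illustration of the point above, the following sketch 
processes and validates a schema-less document without any DTD. It 
assumes the DITA convention of a class attribute that lists an element's 
base types, which a DTD would normally supply as a default attribute 
value and which is written out explicitly here. The element type names 
(recipe, name, steps, para), the "cooking" module, and the two rules 
checked are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Schema-less document: there is no DOCTYPE, so the class values that a DTD
# would normally default are written out explicitly on every element.
DOC = """\
<recipe class="- topic/topic cooking/recipe ">
  <name class="- topic/title cooking/name ">Toast</name>
  <steps class="- topic/body cooking/steps ">
    <para class="- topic/p cooking/para ">Toast the bread.</para>
  </steps>
</recipe>"""


def is_a(element, base_type):
    """True if the element's class attribute lists base_type, e.g. 'topic/p'."""
    return base_type in element.get("class", "").split()


def validate(root):
    """Apply two illustrative rules; an empty list means the document passes."""
    errors = []
    if not is_a(root, "topic/topic"):
        errors.append("root element must be a kind of topic/topic")
    titles = [child for child in root if is_a(child, "topic/title")]
    if len(titles) != 1:
        errors.append("a topic must have exactly one title child")
    return errors


root = ET.fromstring(DOC)
errors = validate(root)

# Processing keys off the declared types, not the element type names.
paragraphs = [el.text for el in root.iter() if is_a(el, "topic/p")]
```

Neither function looks at an element type name, so an application is 
free to rename its elements without affecting this kind of processing.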

Because standardized DITA must, by the nature of standards, be primarily 
a processing and interchange standard (because authoring is always 
localized), it means that the focus of the standard will not be on the 
details of DTDs but on the abstract structures and business rules the 
DTDs are implementation expressions of.


W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8122

