Re: [dita] Some rambling thoughts about the domains attribute

I started writing a long reply to the effect that @domains is essential for making DITA documents self-describing as regards their DITA document type, meaning the set of modules they use and/or allow.

That is, @domains serves as a declaration *on the root element* of what modules the document may reflect and therefore can be validated against and processed in terms of.

However, you can get the same information by simply examining all the @class values in the document. The only limitation is that @class values reflect only what’s actually in the document, not everything the document type allows. But that is sufficient for processors that want to be able to receive arbitrary DITA documents and know everything they need to know about them (assuming the processors have copies of all the modules, or at least of the base modules from which any specializations are derived, which can always be arranged).

For example, I might have some kind of document repository that aggregates DITA content from anyone who cares to submit it. @class values are sufficient to know “this is a topic”, “this is a map”, “this is a metadata element”, etc. I don’t actually care about grammars in this context. If I care to validate, I can simply generalize to OASIS-defined types and validate the result, which is sufficient to reject non-conforming instances. You can think of this as the case of widest-possible scope of DITA interchange.
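To make the "examine all the @class values" idea concrete, here is a minimal Python sketch. It assumes the @class attributes are literally present in the instance (that is, not supplied only as grammar defaults), and the function name modules_used is purely illustrative, not part of any DITA tool.

```python
import xml.etree.ElementTree as ET


def modules_used(xml_text):
    """Return the set of module names found in @class values.

    Assumes @class is explicitly present on the elements; a parser
    without the grammar cannot see defaulted attribute values.
    """
    root = ET.fromstring(xml_text)
    modules = set()
    for elem in root.iter():
        cls = elem.get("class")
        if not cls:
            continue
        # A @class value looks like "- topic/topic concept/concept ":
        # a leading "-" or "+" followed by module/element tokens.
        for token in cls.split():
            if "/" in token:
                module, _, _element = token.partition("/")
                modules.add(module)
    return modules


sample = (
    '<concept class="- topic/topic concept/concept ">'
    '<title class="- topic/title ">T</title>'
    '<conbody class="- topic/body concept/conbody ">'
    '<p class="- topic/p "><b class="+ topic/ph hi-d/b ">x</b></p>'
    '</conbody></concept>'
)
print(sorted(modules_used(sample)))  # prints ['concept', 'hi-d', 'topic']
```

Note that this only reports the modules actually used in the instance, which is exactly the limitation described above.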

DITA as an architecture is unique among XML applications in enabling documents that are completely self describing as (A) DITA documents (the @dita:DITAArchVersion attribute) and (B) what their intended vocabulary is (@domains).

But even without @domains a DITA document is pretty clearly a DITA document if it has @dita:DITAArchVersion and @class on the root element (or children of <dita> for composite docs, although <dita> is a pretty clear indicator of DITA-ness as well). There are many document types that use attributes named @class but none that also use @dita:DITAArchVersion except DITA (which is the whole point of having a DITA-defined namespace in use in DITA documents). If a document has @dita:DITAArchVersion but no @class attribute then it can’t be a conforming DITA document. If it has both then you know everything that @domains could tell you and possibly more than @domains does tell you since we don’t require @domains to list structural modules.
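The check described here can be sketched in a few lines of Python. This is an illustration only: the function name looks_like_dita is hypothetical, the attributes are assumed to be literally present in the instance rather than defaulted from a grammar, and the namespace URI is the one DITA defines for the architecture version attribute.

```python
import xml.etree.ElementTree as ET

# Namespace of the DITA architecture version attribute.
DITA_ARCH_NS = "http://dita.oasis-open.org/architecture/2005/"
ARCH_ATTR = "{%s}DITAArchVersion" % DITA_ARCH_NS


def looks_like_dita(xml_text):
    """Heuristic: DITAArchVersion plus @class on the root element,
    or on each child of <dita> for composite documents."""
    root = ET.fromstring(xml_text)
    # For a composite document, inspect the children of <dita>;
    # an empty <dita> is treated as DITA on the strength of its name.
    candidates = list(root) if root.tag == "dita" else [root]
    return all(
        ARCH_ATTR in elem.attrib and "class" in elem.attrib
        for elem in candidates
    )
```

A document with the architecture attribute but no @class returns False here, matching the point that such a document cannot be a conforming DITA document.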

So maybe @domains is not as essential as I have until now thought that it was.

By this analysis @domains does look very much like Simon Says: an opportunity for error that nobody ever validates or checks. (George Bina did make a Schematron that validates that RNG shells’ domains definitions match the set of modules they include, but I don’t think that really addresses Robert’s concerns.) If you are not concerned with being able to validate grammar-less DITA documents against the set of modules they claim to use, then @domains has very little value indeed beyond declaring attribute specializations. And I think it’s fair to say that the only person who ever considered this use case is probably me.

The use cases I’ve been concerned with historically are:

1. Content reference constraints. This was obviously the original motivation for @domains and as Robert points out there seems to be little actual requirement for this checking in practice.

2. Attribute specialization declaration. This is still important but could be done with an attribute that does only this and nothing else. Because you can’t have the equivalent of @class for attributes there’s really no other good solution that doesn’t involve processing instructions.

3. Recognition of DITA documents that don’t have associated grammars in order to determine what to do with them in the context of some general-purpose DITA processor. Because the @domains attribute enumerates the modules it would be possible to have a system that goes from @domains values to actual grammars for the modules named and does validation dynamically. But of course nobody actually does that in practice because our tools are all too dependent on grammar files and there is too much practical value in having attribute defaults, which again requires the direct use of grammars with documents.
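As a rough illustration of use case 3, the following Python sketch parses a @domains value into its module declarations and looks each module up in a registry of grammar files. The registry, its file paths, and the function names are all hypothetical; the sketch assumes the DITA 1.x token syntax, with plain "(...)" groups for modules, "a(...)" for attribute specializations, and "s(...)" for constraints.

```python
import re

# One @domains token: optional "a" or "s" prefix, then a parenthesized
# ancestry list such as "(topic hi-d)" or "a(props deliveryTarget)".
DOMAINS_TOKEN = re.compile(r"([as]?)\(([^()]*)\)")
KINDS = {"": "module", "a": "attribute", "s": "constraint"}


def parse_domains(value):
    """Return a list of (kind, [ancestor, ..., name]) tuples."""
    result = []
    for prefix, body in DOMAINS_TOKEN.findall(value):
        names = body.split()
        if names:
            result.append((KINDS[prefix], names))
    return result


def grammars_for(value, registry):
    """Map declared modules to grammar files via a hypothetical registry.

    Returns (found_paths, missing_module_names); only plain module
    declarations are looked up.
    """
    found, missing = [], []
    for kind, names in parse_domains(value):
        if kind != "module":
            continue
        module = names[-1]  # the last name is the module being declared
        if module in registry:
            found.append(registry[module])
        else:
            missing.append(module)
    return found, missing
```

A real system would then assemble the found grammars into a document type and validate against it; that step is omitted here because it depends entirely on the grammar technology in use.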

As regards constraints, if you don’t care to validate content references (either at all or in terms of additional constraints) then adding the constraint declarations to @domains just adds noise. Otherwise constraints are entirely an authoring concern and can be handled simply through local configuration of grammars used in authoring (or through authoring-tool-specific configuration).

That is, as long as the document instances still conform to the unconstrained modules the fact that they were authored with constraints is of no interest to any receiver of the documents.

If there is an interchange scenario where some receiver of documents really cares that some set of constraints has been adhered to there are lots of ways to do that using existing tools (shared grammars, schematron, validation applications, etc.).

So, somewhat to my own surprise, I find myself agreeing with Robert that the value of @domains is much lower than we thought and that we could do without it.

Cheers,

Eliot Kimber

http://contrext.com

From: <dita@lists.oasis-open.org> on behalf of Robert D Anderson <robander@us.ibm.com>
Date: Wednesday, January 4, 2017 at 11:24 AM
To: OASIS DITA TC List <dita@lists.oasis-open.org>
Subject: [dita] Some rambling thoughts about the domains attribute

Over time I've come to see less and less benefit + more and more pain from the domains attribute. (Not from domains themselves - repeat, not from domains themselves - just the tokens in that attribute). I think they stand out as one of those DITA things with a high level of complexity, little actual benefit, and (outside of attribute domains) few or no repercussions if you mess it up.

I started writing out all the arguments I've heard in favor of the attribute, and why I think most of them are no longer reasonable. Eventually I ended up with a small book. It's way too much for an email thread, so I've posted the resulting thoughts on a blog, and am curious what people think:
http://metadita.org/toolkit/nonononodomains.html

Warning, you may want to fill up on coffee before taking a look.

Regards,
Robert D. Anderson
DITA-OT lead and Co-editor DITA 1.3 specification
Digital Services Group

E-mail: robander@us.ibm.com
11501 Burnet Rd, Austin, TX 78758-3400, USA
