OASIS Mailing List Archives

dita message

Subject: Article for "DITA elements in strange places" action

I found a previous reply on dita-users that I had written about observed quirks in DITA; I dusted this off and recognized that it is actually sort of a mini-FAQ about several related features of the specialization architecture. It was written prior to the introduction of constraints, and so I have added a draft-comment for the TC to help with that wording about implications for tag count. If we could review this today,


And the readable version:

Quirks and traits of the DITA specialization architecture

Because DITA is an extensible content architecture (that is, a content owner can specialize an existing content model into a more specifically named and more constrained derivation of its parent), its content models sometimes behave in quirky ways.

The "topic" topic type is designed to represent nearly all the common types of discourse that authors use. This design goal means that a basic DITA topic can represent nearly any commonly observed content structure with the fewest possible elements. As a result, the base topic and some of its specializations show behaviors worth knowing about.

Why is the base topic as inclusive as it is?

An archetype must be general enough to support the widest variety of current and yet-to-be-imagined specializations. For example, in the base topic, paragraphs allow lists as content. In practice, some organizations might have authoring guidelines that eschew lists in paragraphs, while others might require a special paragraph that binds introductory content to specialized lists so that processing can handle the two together as one container. The archetype supports both use cases: each organization derives a more specialized content model that carries the data model and business rules its content needs.

Why are DITA elements and attributes sometimes in strange places?

Whenever a new element is specialized from an existing element, it needs to be inserted into the content model in a controlled way so that it is valid in its new contexts. This is done by effectively cloning it as a peer of the element it is based on, in a step of the specialization design pattern called vocabulary substitution (cf?). Hence, a new "specialpara" element appears only in the contexts where its original "p" element appears. Both are then allowed, either/or, as long as the original "p" element remains in the content model. Specialized property attributes are likewise cloned into the same attribute contexts as their base forms.
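The substitution step can be pictured with a small Python sketch. It is a toy model, not DITA's actual DTD or schema machinery: each content model is reduced to a set of allowed child names, and the names `CONTENT_MODELS` and `substitute`, along with the sample contexts, are hypothetical stand-ins chosen only for illustration.

```python
# Toy model: each context maps to the set of child elements it allows.
# The contexts and their contents are illustrative, not the real DITA content models.
CONTENT_MODELS = {
    "body": {"p", "ul", "fig"},
    "li": {"p", "ul"},
    "title": {"ph", "keyword"},
}


def substitute(models, base, specialized):
    """Return new models in which `specialized` is allowed wherever `base` is."""
    result = {}
    for context, allowed in models.items():
        if base in allowed:
            # Clone the new element in as a peer of its base element.
            result[context] = allowed | {specialized}
        else:
            # Contexts that never allowed the base are left unchanged.
            result[context] = set(allowed)
    return result


specialized_models = substitute(CONTENT_MODELS, "p", "specialpara")

for context in sorted(specialized_models):
    print(context, sorted(specialized_models[context]))
```

In this sketch "specialpara" shows up in "body" and "li" because "p" was already allowed there, and it stays out of "title" because "p" was never allowed there. The original "p" remains available in both places, which is the either/or situation described above.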

Specialization may thus cause "tag count" creep. This can be managed by specifically limiting the declared content models of cloned structural elements, or by using the additional constraints notation to "notch out" elements that are no longer required after specialization.

TC, please verify/reword: [In effect, this approach replaces the former element with the new element, resulting in no net growth in tag count.]
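The effect on tag count can be sketched in Python. Again this is a toy model under stated assumptions: a content model is a plain set of element names, `apply_constraint` is a hypothetical helper standing in for a constraint that removes an element, and the sample model is invented for illustration. It shows only the arithmetic behind the statement the TC is asked to verify, not how constraint modules are actually declared.

```python
# Illustrative content model for one context; not taken from the DITA DTDs.
base_model = {"p", "ul", "fig"}

# Specialization by substitution: the new element joins its base as a peer.
specialized_model = base_model | {"specialpara"}


def apply_constraint(model, removed):
    """Return a copy of `model` without the elements listed in `removed`."""
    return model - set(removed)


# Constraint: drop the base "p" now that "specialpara" covers the need.
constrained_model = apply_constraint(specialized_model, ["p"])

print(len(base_model), len(specialized_model), len(constrained_model))
```

Under these assumptions the count grows by one after specialization and returns to its starting value once the base element is constrained out.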

What are the strange elements in some content models?

Some elements in the base topic vocabulary, like ligroup and figgroup, are intended to enable future specializations rather than to model generic data. These are not part of common discourse, but they are important grouping structures for more useful specializations of the list and fig elements.

What does the archetype-based design mean in terms of "tag count?"

Rigorous identification of archetypes also leads to a reduction of tags. For example, most inline phrases belong to one of a few distinctly different fundamental types: keywords and terms, which are not nestable (because they are atomic types of text), and general text phrases, which are nestable. The more semantically significant derivations of these basic types have been moved into optional domains, with the result that the basic topic DTD without domains actually has fewer body elements than HTML. Remaining phrase-like elements such as tm, state, and data are phrase types that have more specific metadata and processing options. The state element is an interesting archetype in its own right (think of flags in a semiconductor discussion, or logic elements in a flow diagram).

  "Where is the wisdom we have lost in knowledge?
  Where is the knowledge we have lost in information?"
  --T.S. Eliot
