Subject: Re: [dita] DITA Processing Model
From: Robert D Anderson <robander@us.ibm.com>
To: "Paul Prescod" <paul.prescod@blastradius.com>
Date: Tue, 25 Oct 2005 08:58:40 -0500
Hi Paul,

Responding specifically to the processing order, the current order in the
toolkit was reached based on a lot of trial and error with IBM documents.
At this point, a lot of those documents depend on the current order, so if
we are going to formalize a processing order, I'd certainly be interested
in keeping the one we have now. On the other hand, I have no objection to
letting users adjust the processing order. It does mean two users may get
different results with the same files, but that is the case with any system
where one user adjusts the pipeline to make their files work differently.

If you'd like more information on why each item runs in the current order,
I'm happy to give details, but I won't have time to do so for a couple of
days.

For IDs - my understanding is that non-topic IDs are supposed to be unique
within a topic. The DITA toolkit's processing pipeline operates under this
assumption. I believe it generates some warning messages for duplicate IDs
within a topic, although today you need to scan through the console log to
find them.
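
To make the per-topic check concrete, here is a minimal Python sketch. It is
not the toolkit's code. The function name duplicate_ids and the sample topic
are made up for illustration, and it assumes a single topic with no nested
topics, so every non-root id shares one scope.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_ids(topic_xml: str) -> list[str]:
    """Return non-topic id values that occur more than once in one topic."""
    root = ET.fromstring(topic_xml)
    # Skip the root element: the topic's own id is scoped to the file,
    # not to the topic. Assumes no nested topics.
    counts = Counter(
        el.get("id") for el in root.iter() if el is not root and el.get("id")
    )
    return sorted(i for i, n in counts.items() if n > 1)


# Hypothetical topic in which two elements share id="info".
sample = """<topic id="topic">
  <title>Example</title>
  <body>
    <table id="info"/>
    <p id="info">Pulled in by conref</p>
    <p id="other">Unique</p>
  </body>
</topic>"""

print(duplicate_ids(sample))  # ['info']
```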

In terms of keeping intermediate files DITA-compliant, we have tried to do
so as much as possible, though I do not think it is required. To that end,
one of the things we do is adjust IDs that are pulled in by conref, so that
they do not conflict with IDs in the current topic. This is important for
later steps that retrieve cross reference text, and might otherwise have to
choose between multiple copies of the same ID. For example, you may have a
cross reference to a table with id="info". If you use conref to pull in a
paragraph or a list item with id="info", then your cross reference of
"#topic/info" will actually be pointing to both.
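
As a rough illustration of the idea (not the toolkit's actual algorithm),
the Python sketch below renames ids on a pulled-in fragment when they clash
with ids already used in the referencing topic. The function name
rename_conflicting_ids and the "_conrefN" suffix scheme are assumptions
chosen only for this example.

```python
import copy
import xml.etree.ElementTree as ET


def rename_conflicting_ids(fragment, used_ids):
    """Return a copy of fragment whose ids do not clash with used_ids."""
    result = copy.deepcopy(fragment)
    taken = set(used_ids)
    for el in result.iter():
        old = el.get("id")
        if old is None:
            continue
        new, n = old, 1
        # Hypothetical naming scheme: append _conref1, _conref2, ...
        while new in taken:
            new = f"{old}_conref{n}"
            n += 1
        el.set("id", new)
        taken.add(new)
    return result


# The referencing topic already contains a table with id="info".
topic = ET.fromstring(
    '<topic id="topic"><body><table id="info"/>'
    '<p conref="lib.dita#lib/info"/></body></topic>'
)
# The conref target, as it appears in the (hypothetical) source file.
pulled = ET.fromstring('<p id="info">Reused paragraph</p>')

used = {el.get("id") for el in topic.iter() if el.get("id")}
fixed = rename_conflicting_ids(pulled, used)
print(fixed.get("id"))  # info_conref1
```

With the id rewritten, "#topic/info" resolves only to the table.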

Robert D Anderson
IBM Authoring Tools Development
Chief Architect, DITA Open Toolkit


"Paul Prescod" <paul.prescod@blastradius.com>
10/21/2005 01:18 PM

To:      "Michael Priestley" <mpriestl@ca.ibm.com>
cc:      <dita@lists.oasis-open.org>
Subject: [dita] DITA Processing Model

 From: Michael Priestley [mailto:mpriestl@ca.ibm.com]
 Sent: Tuesday, October 11, 2005 3:36 AM
 To: Paul Prescod
 Cc: dita@lists.oasis-open.org
 Subject: Re: [dita] Repeated conrefs


 Comments below...

 Michael Priestley
 IBM DITA Architect
 SWG Classification Schema PDT Lead
 mpriestl@ca.ibm.com

 "Paul Prescod" <paul.prescod@blastradius.com> wrote on 10/06/2005 07:50:36
 AM:

 > By definition, every element that is conrefed has an “id” attribute.
 > Is it therefore invalid to conref the same element into the same
 > topic twice?

 It is still valid. The id on most elements is not constrained to be
 unique, precisely to allow this as a valid case.

 [ PAUL PRESCOD ] I do not agree that the id on most elements is not
 constrained to be unique. As I understand it, the id on every element is
 constrained to be unique WITHIN SOME CONTEXT. If you conref a topic into
 the same file twice then you will run into a per-file uniqueness problem.
 If you conref a paragraph into a topic twice then you will run into a
 per-topic uniqueness problem.

 >And to conref the same topic into a parent topic twice?

 This is not so valid. Topic-level elements must have unique ids. If the
 same topic gets conref'd into a document in multiple places, this does
 produce an error once the resolved document is parsed, and the conref
 processor doesn't currently check for that.

 [ PAUL PRESCOD ] What in the DITA specification would lead me to
 understand that this is not valid? Obviously having two topics with the
 same ID is not possible in a DITA input file. But it isn't clear to me
 what constraints there are on the post-CONREF output. It isn't even clear
 to me whether the post-CONREF output is supposed to be uniformly DITA
 compliant. Similar questions apply to the post-FILTERING output, the
 post-GENERALIZATION output and the post-MAP-combination output. The
 questions are compounded when you combine these transformations.

 SGML, XML 1.0 and the XML Family have had similar specification problems,
 and these questions were only ever answered by picking the
 brains of the architects (which, in the case of the XML family of
 standards was almost impossible because the architects for different specs
 did not necessarily communicate). Having been through that process three
 times, I'd rather get the answers into the DITA spec while we still have
 time before we are mobbed with hundreds of implementors who are not
 content to just copy the DITA Toolkit. In particular, I am proposing a
 formal DITA processing model as per the XML processing model proposed
 here:

 http://www.mnot.net/papers/XMLProcessingWS.html

 And I am proposing it for the same basic reasons:

 "The original motivation for defining a processing model was to normalize
 the application of W3C-defined mechanisms; it is a very different thing to
 apply XSLT and then XInclude, as opposed to XInclude and then XSLT."

 Similarly, it is a very different thing to filter before conref than to
 conref before filter. For example, if you filter before conref, then a
 conref to a filtered element is a broken link. But if you conref before
 generalizing then there is an intermediate state where a topic may have
 elements within it that are not allowed by its content model. So DITA
 should either specify the order of these things (hard-coded) or provide a
 way for DITA users to specify the order (some kind of pipelining
 language).
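
The effect of ordering can be sketched with a toy model in Python. This is
only an illustration: elements are plain dicts keyed by id, EXCLUDED,
filter_step, conref_step and run are hypothetical names, and the sketch
assumes that pulled-in content carries the target's "audience" value.

```python
EXCLUDED = {"admin"}  # hypothetical filter condition


def filter_step(elements):
    """Drop elements whose 'audience' value is excluded."""
    return {k: v for k, v in elements.items() if v.get("audience") not in EXCLUDED}


def conref_step(elements):
    """Replace each conref with a copy of its target, or mark it broken.

    Only one level of conref is resolved in this sketch.
    """
    resolved = {}
    for key, el in elements.items():
        target = el.get("conref")
        if target is None:
            resolved[key] = el
        elif target in elements:
            resolved[key] = dict(elements[target])
        else:
            resolved[key] = {"text": f"BROKEN conref to {target}"}
    return resolved


def run(elements, steps):
    """Apply the pipeline steps in the given order."""
    for step in steps:
        elements = step(elements)
    return elements


source = {
    "warning": {"text": "Admins only.", "audience": "admin"},
    "reuse": {"conref": "warning"},
}

filter_first = run(source, [filter_step, conref_step])
conref_first = run(source, [conref_step, filter_step])

print(filter_first)  # {'reuse': {'text': 'BROKEN conref to warning'}}
print(conref_first)  # {}
```

The same input yields a broken link in one order and a cleanly removed
element in the other, which is why the order needs to be pinned down.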

 >
 > Is it legal for an element with a DITA conref to reference another
 > element with a conref? If so, shouldn’t DITA disallow mutually
 > recursive conrefs?

 It should, it's just an edge case that is expensive to check for, so I
 don't know of any process that currently does.
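
For what it is worth, a check for conref loops can be sketched in a few
lines of Python once the conref targets are collected. This is a minimal
sketch, not any existing processor's code: it assumes a dict that maps each
referencing element's address to its conref target, and find_conref_cycle
and the addresses are hypothetical.

```python
def find_conref_cycle(conrefs, start):
    """Follow the conref chain from start; return the cycle as a list, or None."""
    path = []
    seen = {}  # address -> position in path
    current = start
    while current in conrefs:
        if current in seen:
            # Revisited an address: the loop runs from its first visit.
            return path[seen[current]:] + [current]
        seen[current] = len(path)
        path.append(current)
        current = conrefs[current]
    return None


# Hypothetical addresses: p1 and p2 conref each other; p3 points into the loop.
conrefs = {
    "a.dita#a/p1": "b.dita#b/p2",
    "b.dita#b/p2": "a.dita#a/p1",
    "c.dita#c/p3": "a.dita#a/p1",
}

print(find_conref_cycle(conrefs, "c.dita#c/p3"))
# ['a.dita#a/p1', 'b.dita#b/p2', 'a.dita#a/p1']
print(find_conref_cycle({"x": "y"}, "x"))  # None
```

Because each element has at most one conref, following the chain costs one
dictionary lookup per hop; gathering the targets across files is the
expensive part.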

 [ PAUL PRESCOD ] Okay, so we agree that the specification should change in this way. Is
 there a tool we should use to keep track of this agreement? i.e. an issue
 tracking system?

  Paul Prescod