dita message

Subject: RE: [dita] Fw: [dita-comment] #12013 Referencing a range of elements

From: Deborah_Pickett@moldflow.com
To: "Yas Etessam" <yas.etessam@justsystems.com>
Date: Wed, 8 Aug 2007 16:21:33 +1100

Hi Yas (and TC),

Comments below.

--
Deborah Pickett
Information Architect, Moldflow Corporation, Melbourne
Deborah_Pickett@moldflow.com

"Yas Etessam" <yas.etessam@justsystems.com> wrote on 08/08/2007 02:03:09 PM:
> For the pre-processing issue, moving to have the toolkit change the
> pre-processing flow is the best suggestion.

I agree. (Does this have nasty consequences for processing? I sort of see why DITA-OT filters before conref - it's a case of whittling down the working set size as early as possible - but range conref is something that this optimization breaks.)
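The ordering question can be illustrated with a minimal Python sketch. It is not DITA-OT code: the helper names `filter_out` and `resolve_range` are hypothetical, filtering is reduced to dropping elements whose `product` attribute matches an excluded value, and the sample steps are adapted from the shared-steps example quoted later in this thread.

```python
import xml.etree.ElementTree as ET

# Shared steps, adapted from the example in the thread.
SHARED = """
<steps>
  <step id="first" product="one"><cmd>Flush all widgets.</cmd></step>
  <step><cmd>Open a new widget.</cmd></step>
  <step product="two"><cmd>Set advanced options.</cmd></step>
  <step id="last"><cmd>Press OK to continue.</cmd></step>
</steps>
"""


def filter_out(elements, excluded_product):
    """Hypothetical filter: drop elements flagged for the excluded product."""
    return [e for e in elements if e.get("product") != excluded_product]


def resolve_range(elements, start_id, end_id):
    """Hypothetical range resolution over a list of siblings.

    Returns the siblings from start_id to end_id inclusive, or an empty
    list when either end of the range cannot be found.
    """
    ids = [e.get("id") for e in elements]
    if start_id not in ids or end_id not in ids:
        return []
    lo, hi = ids.index(start_id), ids.index(end_id)
    return elements[lo:hi + 1]


def cmds(steps):
    """Collect the <cmd> text of each step, for easy comparison."""
    return [s.findtext("cmd") for s in steps]


steps = list(ET.fromstring(SHARED))

# Order A: filter first, then resolve the range (start marker is gone).
filter_then_conref = resolve_range(filter_out(steps, "one"), "first", "last")

# Order B: resolve the range first, then filter the pulled steps.
conref_then_filter = filter_out(resolve_range(steps, "first", "last"), "one")

print(cmds(filter_then_conref))
print(cmds(conref_then_filter))
```

Under these assumptions, order A yields nothing when product "one" is excluded, while order B keeps the three remaining steps.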

> The ability for a user to select "all children" of an element for
> conreffing purposes is a new use case and a specific problem area
> that 12013 was not intended to solve. Now that it is on the table
> and as there are multiple conref proposals on the table for 1.2
> (12013, 12014, 12015), my suggestion is that this particular use
> case might be best expressed as a new proposal number, or forked off
> as 12013B and then go through the standard process to validate that
> we want to solve this in DITA 1.2 and then come up with a solution design.

Fair enough. Let's see what others say on the list and at TC meetings, and if there's consensus I am happy to write something up for this use case. I have a couple of ideas, the "all children" version being just one possible solution, so which way it goes depends on how the existing proposals fare at TC meetings. I just want to be sure that all the various conref proposals cohere well enough that they don't come across as disjoint brainchildren of their respective TC members. (Which is incidentally why I am having cold feet over my own #12035. I've painted enough bikesheds already.)

> In terms of generalization, adding the assumption that
> generalization processing will be the same as standard element
> processing during a conref (text nodes would be the exception) is
> implied by the fact that this range markup is only meant to provide
> a short hand:
>
> Source Document A
> <ph conref="sharedstuff.xml#sharedstuff/b1" conreftype="start">
> <ph conref="sharedstuff.xml#sharedstuff/b3" conreftype="end">
>
> Where the target topic look like this
> <topic id="sharedstuff"> ....
> <ph id="b1">..
> <ph id="b2">..
> <ph id="b3">..
>
> When processed (and generalized) should have the same results as:
>
> Source Document B
> <ph conref="sharedstuff.xml#sharedstuff/b1">
> <ph conref="sharedstuff.xml#sharedstuff/b1">
> <ph conref="sharedstuff.xml#sharedstuff/b1">

Sorry, I didn't make my point very well in my last message. Here's what I mean.

Source Document:
<ph conref="sharedstuff.xml#sharedstuff/b1" conreftype="start">

Where the target topic looks like this:
<topic id="sharedstuff"> .... .. ..
<systemoutput>.........</systemoutput>
<ph conref="otherfile.xml#topicid/userinput"/> ..
By standalone-conref rules, b1 should become <ph>. b3 is already , so no problem. What about b2? <ph> or ? What about the <systemoutput>? What about the inside the <systemoutput>? What about the included <ph> that points to a <userinput> (doing its own generalization)?

Choices are:
- This is not allowed, and element names must match (hence no generalization during ranged conref). But it's allowed for non-ranged conref, and you still have to decide how indirect conref (the <userinput> example) is handled.
- Generalize only the start/end elements (because we know what they are). Leave intermediate elements as general as the included domains allow. Likely valid (and compatible with point-conref), but oddly asymmetric.
- Generalize the intermediate elements "to the same extent" as the fencepost elements (whatever that means).
- Don't generalize at all. Ignore the elements doing the conref, replace with the pulled elements and hope for the best that the result is valid. But you still have to deal with indirect conref (<userinput> again). This is sort of what <topicref> with format="ditamap" does, and we all seem to be OK with that. But is this even conref any more?
There might be others.
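
To make the differences between these choices concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `BASE` ancestry table stands in for real specialization information, the policy names are invented labels for three of the choices above, and the element names in the sample range are made up rather than taken from the example. The "to the same extent" choice is left out because its meaning is not defined.

```python
# Hypothetical specialization ancestry: element name -> its base element.
BASE = {"systemoutput": "ph", "userinput": "ph", "ph": None}


def generalize_to(name, target):
    """Return target if it is name itself or one of its ancestors, else None."""
    current = name
    while current is not None:
        if current == target:
            return target
        current = BASE.get(current)
    return None


def resolve(pulled, puller, policy):
    """Apply one policy to a pulled range.

    pulled: element names in the range, in document order.
    puller: element name carrying the start/end conref markers.
    policy: an invented label for one of the choices listed above.
    """
    result = list(pulled)
    if policy == "must-match":
        # Choice 1: no generalization; every name must equal the puller's.
        if any(name != puller for name in result):
            raise ValueError("element names must match the referencing element")
        return result
    if policy == "fenceposts-only":
        # Choice 2: generalize only the first and last elements.
        if not result:
            return result
        for i in {0, len(result) - 1}:
            generalized = generalize_to(result[i], puller)
            if generalized is None:
                raise ValueError(f"{result[i]} cannot generalize to {puller}")
            result[i] = generalized
        return result
    if policy == "none":
        # Choice 4: take the pulled elements as they are.
        return result
    raise ValueError(f"unknown policy: {policy}")


# Illustrative range only; these are not the elements from the example.
sample = ["userinput", "systemoutput", "ph"]

print(resolve(sample, "ph", "fenceposts-only"))
print(resolve(sample, "ph", "none"))
```

With this sample, the fenceposts-only policy turns the first element into `ph` and leaves the middle `systemoutput` alone, which is the asymmetry noted in the second choice.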

Even authoring tools could produce such markup in the case where the puller permits fewer domains than the pulled topic.

> Assuming that the conref resolution/filtering order can be swapped,
> the resulting XML documents after processing need to be the same.
> The major delta between regular conref and the ranged conref is the
> fact that the intermediary text nodes will only come along with the
> range as there is no way to address a direct conref to a text node.

... also that intermediate elements without IDs come along for the ride. That's probably worth mentioning explicitly because up till now it's been assumed that elements without IDs cannot be referenced from elsewhere, even indirectly. Tools/CMSs that restrict their indexing to elements with IDs (for instance, to track backreferences) will need to rethink those assumptions. In that respect, #12013 is more than mere "shorthand", because it provides something that an authoring tool can't just fake by munging XML.
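
A small Python sketch of that indexing gap follows. It assumes a tool that indexes only elements carrying an `id`, and a hypothetical `range_members` helper that pulls every sibling between the start and end ids; the sample paragraph is invented.

```python
import xml.etree.ElementTree as ET

# Invented target content: the middle <ph> has no id.
TOPIC = ET.fromstring(
    '<p><ph id="b1">one</ph> and <ph>two</ph> then <ph id="b3">three</ph> end</p>'
)


def range_members(parent, start_id, end_id):
    """Hypothetical range pull: siblings from start_id to end_id inclusive."""
    children = list(parent)
    ids = [c.get("id") for c in children]
    lo, hi = ids.index(start_id), ids.index(end_id)
    return children[lo:hi + 1]


members = range_members(TOPIC, "b1", "b3")

# What an id-only index would know about.
id_index = {el.get("id") for el in TOPIC.iter() if el.get("id")}

# Elements that are reused through the range but invisible to that index.
unindexed = [el for el in members if el.get("id") not in id_index]

# Text nodes between range members also come along (ElementTree stores
# them as the "tail" of the preceding element).
between_text = [el.tail for el in members[:-1]]

print(sorted(id_index))
print([el.text for el in unindexed])
print(between_text)
```

Here the id-only index holds `b1` and `b3`, yet the range also reuses the unidentified middle element and the text on either side of it.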

> Hoping that this is clearing up some of the questions,

Absolutely. Hoping, too, that this is helping to make a spec that implementations can and will do correctly.

> Yas
>
>
> From: Deborah_Pickett@moldflow.com [mailto:Deborah_Pickett@moldflow.com]
> Sent: Tuesday, August 07, 2007 7:11 PM
> To: Yas Etessam
> Cc: dita@lists.oasis-open.org
> Subject: Re: [dita] Fw: [dita-comment] #12013 Referencing a range of elements
>
>
> My comments below...
>
> --
> Deborah Pickett
> Information Architect, Moldflow Corporation, Melbourne
> Deborah_Pickett@moldflow.com
>
> "Yas Etessam" <yas.etessam@justsystems.com> wrote on 08/08/2007 11:03:40 AM:
>
> > Hello Deborah,
> >
> > Thank you for your comments, please see responses below.
> >
> > Yas
> >
> >
> > > The proposal doesn't speak of what happens when filtering removes
> > > one or both ends of the range. Is this invalid? Or is it valid,
> > > with the intervening unfiltered elements remaining? Is it still
> > > valid if all elements are filtered out?
> > According to this diagram that shows the DOT's pre-processing
> > architecture,
> > http://dita-ot.sourceforge.net/SourceForgeFiles/doc/DITA-preprocessarch.html,
> > filtering will happen before conrefs are resolved.
> >
> > If the start/end markers are filtered out, there is nothing to pull in.
> >
> > The situation would be the same if a pair of matched start/end markers
> > are orphaned because of filtering. There is no way to pull in the
> > intervening (unfiltered elements) if a start or end markers is been
> > filtered out.
>
> That's a problem. It breaks a use case that I expect to see a lot:
> pulling in a series of steps that contain filtering for different products.
>
> Example:
>
> <task id="sharedsteps">
> ...
> <steps>
> <step id="first" product="one"><cmd>Flush all widgets.</cmd></step>
> <step><cmd>Open a new widget.</cmd></step>
> <step product="two"><cmd>Set advanced options.</cmd><substeps>...</substeps></step>
> <step id="last"><cmd>Press OK to continue.</cmd></step>
> </steps>
> ...
> </task>
>
> Pulling this series of steps thus:
> <steps>
> <step conref="shared.xml#sharedsteps/first" conreftype="start"/>
> <step conref="shared.xml#sharedsteps/last" conreftype="end"/>
> <step><cmd>Now go do something else.</cmd></step>
> </steps>
> will work only if product "one" is not filtered out.
>
> I think that this is going to astonish a lot of users. I don't want
> to have to respond to their question "why doesn't it work" with
> "that's how we designed it".
>
> I see three choices for a resolution:
> - Conref resolution must be done before filtering, and DITA-OT is
>   currently off spec. Then this example would get the useful
>   behaviour of filtering the sequence of steps after they are pulled
>   into existence.
> - We state that the proposal isn't intended to cover this use case.
>   In big letters, because users will expect it to work irrespective
>   of filtering.
> - The proposal needs further work to cover this use case.
>
> > After conferring with Rob Anderson offline, bringing in nodes as
> > part of the intermediary range is fine.
>
> Good, thanks, that makes me happy.
>
> > > What happens when conrefs are chained? [...]
> > If you have a regular conref and it points to an element that has a
> > start conref, the user's intention is probably to pull in that single
> > element not the range.
>
> This needs to be specified, not left to a default "probable user's
> intention". Particularly since normal conref processing might pick
> up the conreftype="start" attribute and leave you with a dangling
> start without end. It may be safest to simply forbid conref
> chaining where the types don't match. If we find a use for it
> later, re-allow it.
>
> > > I worry about the onus that the author of the reusable content has
> > > in marking the start/end elements *and not messing with them
> > > later*. That presents a level of coupling that standard conref
> > > doesn't have. In particular, the common use case of "pull all of
> > > the steps in a task, then add some of my own" has more overhead
> > > than it should. (Think of a task with only one step (and so only
> > > one id on the step). Others pull this singleton step with a range
> > > conref, with start/end ids the same. I then add a second step, and
> > > give it a new id. But all the range conrefs pulling my steps don't
> > > know about this, because they only know of the one id, of my
> > > original step.)
>
> I particularly wanted to hear people's reaction to this one. Here's
> an example spelling it out.
>
> With one step:
> <task id="sharedsteps">
> ...
> <steps id="allsteps">
> <step id="first"><cmd>Do this thing.</cmd></step>
> </steps>
> ...
> </task>
>
> Pulled into this file which someone else is maintaining:
> <steps>
> <step conref="shared.xml#sharedsteps/first" conreftype="start"></step>
> <step conref="shared.xml#sharedsteps/first" conreftype="end"></step>
> <step><cmd>Now do some other stuff.</cmd></step>
> </steps>
>
> All fine, until I add a second step:
> <task id="sharedsteps">
> ...
> <steps>
> <step id="first"><cmd>Do this thing.</cmd></step>
> <step id="last"><cmd>I forgot, do this thing too.</cmd></step>
> </steps>
> ...
> </task>
>
> All of the tasks which pull my list of steps need to be updated to
> point to sharedsteps/last. This amounts to finding backreferences,
> which I am not at all comfortable with. Not every user has a cosy
> CMS to handle that for them automatically.
>
> It's the transition from one thing to two things that is the issue
> here. Even moving to a one-element/two-conref-attribute model won't
> solve this one.
>
> Thinking out loud, how about something like:
> <steps>
> <step conref="shared.xml#sharedsteps/allsteps" conreftype="allchildren"/>
> <step><cmd>Now do some other stuff.</cmd></step>
> </steps>
> The elements would all have to be the same type, but here they
> happen to be . . .
>
> > > And now the big one . . . I need to be convinced that the
> > > two-adjacent-element conreftype=start/end markup is better than the
> > > alternative one-element-with-two-conref-attributes markup.
> [...]
> > As the conref feature is already implemented in many tools and CMS
> > systems, the addition of a single modifier to enable this form of
> > short-hand conref is a simpler change.
>
> Yes, I imagine that the changes for maintainers of tools, editors
> and CMSs will probably prefer that over rehashing their
> conref-resolution code. It still feels to me like a case of the tail
> wagging the dog, but I won't argue it any more; I've made my opinion
> clear. But I'd still like to know your (and others') thoughts on
> the above two examples.
>
> Oh, one more thing. Explicitly naming the first and last elements
> at the pulling end leaves room for there to be some
> generalizing-during-conref on those elements (e.g., you pull in a with
> <ph conref="...">). What about the intervening elements? Generalize as
> little as needed to make them fit? Is that information available
> during conref? Or forbid generalization-during-conref for ranges?

Follow-Ups:
- RE: [dita] Fw: [dita-comment] #12013 Referencing a range of elements
 - From: Robert D Anderson <robander@us.ibm.com>

References:
- RE: [dita] Fw: [dita-comment] #12013 Referencing a range of elements
 - From: "Yas Etessam" <yas.etessam@justsystems.com>