OASIS Mailing List Archives

dita message


Subject: Re: [dita] Conref range: Is constraint on last member of range necessary or useful?

I think you have misunderstood when a content model is or is not ambiguous.

In the case of the sequence (a,b,a,c,a,d), given an initial <a> element
there is no question that it must match the first "a" token in the content
model, then you must have a b, then another a, and so on. Because there are
no choices here, the content model is deterministic (not ambiguous).
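
To make the distinction concrete, here is a minimal Python sketch of the usual position-based test for determinism: a model is deterministic when no set of possible next tokens (the initial set, or the set that can follow any one token) contains the same element name twice. The tuple encoding of content models and the names analyse, is_deterministic and tok are illustrative assumptions of mine, and the sketch handles only sequence and choice groups, not occurrence indicators.

```python
from itertools import count


def analyse(node, counter, follow, symbol):
    """Return (nullable, first, last) position sets for a content-model node."""
    kind, payload = node
    if kind == "tok":
        pos = next(counter)  # each token occurrence gets its own position
        symbol[pos] = payload
        follow[pos] = set()
        return False, {pos}, {pos}
    results = [analyse(child, counter, follow, symbol) for child in payload]
    if kind == "alt":  # choice group: (x | y)
        nullable = any(r[0] for r in results)
        first = set().union(*(r[1] for r in results))
        last = set().union(*(r[2] for r in results))
        return nullable, first, last
    # sequence group: (x, y)
    nullable, first, last = True, set(), set()
    for child_nullable, child_first, child_last in results:
        for pos in last:
            follow[pos] |= child_first  # child can start right after the prefix
        if nullable:
            first |= child_first
        last = (last | child_last) if child_nullable else set(child_last)
        nullable = nullable and child_nullable
    return nullable, first, last


def is_deterministic(model):
    """True if no 'next token' set offers the same element name twice."""
    follow, symbol = {}, {}
    _, first, _ = analyse(model, count(), follow, symbol)
    for candidates in [first, *follow.values()]:
        names = [symbol[pos] for pos in candidates]
        if len(names) != len(set(names)):
            return False
    return True


def tok(name):
    return ("tok", name)


# (a, b, a, c, a, d): every token has exactly one possible successor.
SEQUENCE = ("seq", [tok(n) for n in "abacad"])
# ((a, b) | (a, c)): an initial <a> could match either "a" token.
CHOICE = ("alt", [("seq", [tok("a"), tok("b")]), ("seq", [tok("a"), tok("c")])])

print(is_deterministic(SEQUENCE))
print(is_deterministic(CHOICE))
```

Under these assumptions the plain sequence passes the test and the choice between (a, b) and (a, c) does not, because only the latter ever offers two competing "a" tokens at the same point.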

But, in any case, XML does not *require* reporting of non-deterministic
content models: "XML processors built using SGML systems *may* flag
non-deterministic content models as errors." (Emphasis mine.)

In point of fact, I don't believe any XML parsers in common use report
non-deterministic models, at least by default, simply because there is no
need to: they can all validate against such models just fine.

I tried the experiment of making my sample ambiguous, e.g., by changing it to:

<!ELEMENT foo ((a, b) | (a, c)) >

And Xerces (through Oxygen) correctly reported my document as invalid
(because it no longer satisfied the content model) but it did not complain
about the content model itself.
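
As a rough illustration of why validating against an ambiguous model is not a problem in itself, here is a small Python sketch that turns a content model into a regular expression over the child element names and lets a backtracking matcher do the work. This is not a description of how Xerces is implemented; the tuple encoding and the names model_to_regex and validates are my own assumptions for the example.

```python
import re


def model_to_regex(node):
    """Translate a ("tok" | "seq" | "alt", payload) node into a regex string."""
    kind, payload = node
    if kind == "tok":
        # Each element name is followed by a space so names cannot run together.
        return re.escape(payload) + " "
    joiner = "" if kind == "seq" else "|"
    return "(?:" + joiner.join(model_to_regex(child) for child in payload) + ")"


def validates(model, children):
    """True if the list of child element names satisfies the content model."""
    text = "".join(name + " " for name in children)
    return re.fullmatch(model_to_regex(model), text) is not None


# ((a, b) | (a, c)): ambiguous, since an initial <a> fits either branch.
AMBIGUOUS = (
    "alt",
    [
        ("seq", [("tok", "a"), ("tok", "b")]),
        ("seq", [("tok", "a"), ("tok", "c")]),
    ],
)

print(validates(AMBIGUOUS, ["a", "c"]))
print(validates(AMBIGUOUS, ["a", "b", "a", "c", "a", "d"]))
```

The matcher accepts a, c by falling back to the second branch and rejects the original six-element sample, which is the same outcome I saw: the document is reported invalid, with no complaint about the model.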

I tried editing the document with the ambiguous content model in Arbortext
Editor 5.4 because, if any processor were going to report ambiguous content
models, I would expect it to be Editor, given its SGML heritage. But it did
not.

So even if a dependence on non-determinism would help in this case (which I
don't see that it actually does) you couldn't rely on parsers reporting it.

But in any case, my original example is correct as written in that the
content model is not non-deterministic, so I think my argument stands:
requiring matching end types in ranges can't prevent non-DTD-valid results
following resolution and therefore there's no point in making the
requirement.
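
A minimal Python sketch of that argument, using the six-element sample quoted below: it replaces the referencing element with the referenced range and checks the result against (a, b, a, c, a, d). The (name, id) tuples and the functions resolve_range and is_valid are hypothetical simplifications of mine, not the DITA conref algorithm as specified.

```python
# The content model (a, b, a, c, a, d), as a fixed list of element names.
SEQUENCE_MODEL = ["a", "b", "a", "c", "a", "d"]


def resolve_range(referencing, at, referenced, start_id, end_id):
    """Replace referencing[at] with referenced elements start_id..end_id."""
    ids = [elem_id for _, elem_id in referenced]
    start, end = ids.index(start_id), ids.index(end_id)
    return referencing[:at] + referenced[start:end + 1] + referencing[at + 1:]


def is_valid(children):
    """True if the element names match the sequence model exactly."""
    return [name for name, _ in children] == SEQUENCE_MODEL


# Children of <foo> in the referenced document, as (name, id) pairs.
referenced = [
    ("a", "a1"), ("b", "b1"), ("a", "a2"),
    ("c", "c1"), ("a", "a3"), ("d", "d1"),
]
# Children of <foo> in the referencing document; the first <a> holds the conref.
referencing = [(name, None) for name in SEQUENCE_MODEL]

# Range a2..a3 starts and ends with <a>, so it meets the matching-end rule.
same_type = resolve_range(referencing, 0, referenced, "a2", "a3")
# Range a1..b1 ends with <b>, so the rule forbids it.
mixed_type = resolve_range(referencing, 0, referenced, "a1", "b1")

print([name for name, _ in same_type], is_valid(same_type))
print([name for name, _ in mixed_type], is_valid(mixed_type))
```

Both resolved results fail the sequence model, so in this sketch the range that satisfies the matching-end rule is no more valid than the one that violates it.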



On 2/10/11 9:21 AM, "Michael Priestley" <mpriestl@ca.ibm.com> wrote:

> Hi Eliot, 
>> Determinism only applies when there is optionality.
> Optionality is merely one way to create indeterminacy.
> From the URL I sent:
>> given an initial b the XML processor cannot know which b in the model is
>> being matched without looking ahead to see which element follows the b.
> That certainly describes your example below.
> Michael Priestley, Senior Technical Staff Member (STSM)
> Lead IBM DITA Architect
> mpriestl@ca.ibm.com
> http://dita.xml.org/blog/25
> From: Eliot Kimber <ekimber@reallysi.com>
> To: Michael Priestley/Toronto/IBM@IBMCA
> Cc: dita <dita@lists.oasis-open.org>
> Date: 02/10/2011 10:06 AM
> Subject: Re: [dita] Conref range: Is constraint on last member of range
> necessary or useful?
> Determinism only applies when there is optionality. Consider this DTD:
> <!ELEMENT root (foo) >
> <!ELEMENT foo (a, b, a, c, a, d) >
> <!ELEMENT a (#PCDATA)* >
> <!ELEMENT b (#PCDATA)* >
> <!ELEMENT c (#PCDATA)* >
> <!ELEMENT d (#PCDATA)* >
> And this valid instance:
> <!DOCTYPE root SYSTEM "sequence-test.dtd">
> <root>
>   <foo>
>     <a id="a1"></a>
>     <b id="b1"></b>
>     <a id="a2"></a>
>     <c id="c1"></c>
>     <a id="a3"></a>
>     <d id="d1"></d>
>   </foo>
> </root>
> The element type <a> is allowed in three places, once followed by <b>, once
> followed by <c>, once by <d>. From a referencing topic I could do this:
> <foo>
>   <a conkeyref="sequence-test.dtd/a2"
>      conrefend="x#x/a3"
>  />
>  <b/><a/><c/><a/><d/>
> </foo>
> The referenced range is not DTD valid in the referencing context (and in
> fact in this example there is no possible referenced range that would be DTD
> valid since the content model has no option members).
> Since creating invalid reference results is not (and cannot be) disallowed,
> it must be allowed. Requiring that the last member be <a> in this case
> doesn't make the result any *more* valid nor does it make it less valid.
> But the reference is correct per the conref constraints.
> Again, in this example, why should I be disallowed from referencing the
> sequence <a/><b/>?
> So while this may not be a likely case it is a possible case and it
> demonstrates that imposing a requirement on the last member of the sequence
> doesn't help ensure sensibility or DTD validity of the result.
> In practice you would expect to only use conref range in the context of
> parent elements with repeating OR groups but that itself is not a stated
> requirement of the facility in DITA 1.2. But even in that case, requiring a
> specific sequence end doesn't make much sense since in the case of repeating
> OR groups all valid members of the group will always be valid wherever they
> occur, so again, requiring a specific end element doesn't appear to help
> (because validity is guaranteed in the repeating OR group case).
> We can break the possibilities down as follows:
> 1. Parent content model is a repeating OR group: no possible sequence of
> siblings that satisfy general conref constraints can be invalid. No need to
> constrain any node of sequence. Validity of referenced result is ensured by
> normal same-or-more-specialized type requirements on referenced elements
> (including parent of referenced elements).
> 2. Parent content model is a sequence where first item in the referenced
> sequence is required wherever it occurs (example shown above). In this case,
> validity of the referenced result cannot be guaranteed in any case.
> Constraining the last node of the sequence cannot help, as shown above.
> 3. Parent content model is a sequence of elements where first member of the
> referenced sequence is optional in some cases. Determinism rules disallow
> construction of naïve content models but any non-deterministic content model
> can be rewritten as a sequence of sequences that reflect all possible
> combinations of elements. Therefore this case resolves to case 2. Again,
> constraint of last member cannot help.
> So I don't see how the last member constraint can ever help and I can think
> of cases where it gets in the way. Thus it appears to be an unnecessary
> requirement that requires content model design that wouldn't otherwise be
> required and that does not satisfy the intended goal, namely ensuring
> sensibility of the conref result.
> Cheers,
> E.
> On 2/10/11 7:25 AM, "Michael Priestley" <mpriestl@ca.ibm.com> wrote:
>> Hi Eliot, 
>> Re: 
>>> 2. It is possible to define sequence content models that allow a given type
>>> to occur in multiple places within the sequence but that allow different
>>> following siblings.
>> I don't believe that's true. See:
>> http://www.w3.org/TR/2000/REC-xml-20001006#determinism
>> The current design requires matching start/end elements explicitly to
>> leverage determinism.
>> Michael Priestley, Senior Technical Staff Member (STSM)
>> Lead IBM DITA Architect
>> mpriestl@ca.ibm.com
>> http://dita.xml.org/blog/25
>> From: Eliot Kimber <ekimber@reallysi.com>
>> To: dita <dita@lists.oasis-open.org>
>> Date: 02/10/2011 08:03 AM
>> Subject: [dita] Conref range: Is constraint on last member of range necessary
>> or useful? 
>> I'm writing up my explanation of conref range for my book and in explaining
>> the rule that the first and last elements of the range must be the same type
>> but intermediate members need not be, it occurs to me that there's
>> really no point in having the constraint on the last member of the range.
>> Since I obviously didn't think about this too much at the time the mechanism
>> was proposed, I'm wondering if there was more thinking behind the constraint
>> than is evident from the language of the spec itself.
>> My questioning of the value of the constraint comes from this analysis:
>> 1. The requirement that the referencing and referenced elements have
>> compatible parent elements ensures that the start element of the range is
>> valid in the referencing context.
>> 2. It is possible to define sequence content models that allow a given type
>> to occur in multiple places within the sequence but that allow different
>> following siblings. This means that the referencing element could refer to a
>> range that is inconsistent with the sequence rules in the referencing
>> context. Since this case is not explicitly disallowed, it must not be a
>> concern. This means that strict DTD validity of the conref result cannot be
>> ensured in the general case and there is no general requirement to ensure
>> it.
>> Likewise, since there are no constraints on the intermediate members beyond
>> common parentage, there must be no general concern about DTD validity of the
>> resolved result.
>> 3. Given (2) it can't possibly help to require the last member of a sequence
>> to be the same as the start since it cannot make the result more valid.
>> 4. Requiring that the start and end of the range be the same disallows use
>> of conref range for referencing sequences where the content model does not
>> allow the initial type to occur at the end of the range.
>> For example, say you have a specialized topic type that defines a set of
>> distinct specializations of <section> and puts them in a specific order. It
>> would be impossible to use conref range to re-use the sequence of sections
>> from another topic of the same type even though the result must be DTD
>> valid.
>> Thus, the requirement seems to be both unnecessary (it doesn't help ensure
>> correctness or sensibility of the conref result) and it disallows legitimate
>> cases.
>> Perhaps for 1.3 we should consider removing this constraint.
>> Cheers,
>> E.

Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 512.554.9368
