

Subject: [relax-ng] Re: Interleave and text



> I have been trying to figure out whether we need to make it an error if
> <text/> occurs in both operands of an <interleave/>.  So far I haven't had
> much success.  I haven't been able to construct an example that causes Jing
> to blow up, nor have I been able to convince myself that it is impossible to
> construct such an example. Help!

I now believe it's OK to have <text/> in both branches of an
interleave. Here is an informal proof:


The property that we need to preserve here is:

  given a token (element, attribute, or character literal), it must
  always be possible to uniquely determine the pattern that consumes
  the token.

Assume the current residual R = <interleave> ... </interleave> and the
current token is a character literal. Say R contains <text/>.

According to the contextual restriction, we don't have <data/> or
<value/> in R.

If R cannot accept this token, then the above property is trivially
preserved. This happens for an R like

<interleave>
  <group>
    <element name="X"/>
    <text/>
  </group>
  <group>
    <element name="Y"/>
    <text/>
  </group>
</interleave>

If R can accept this token, then there must be one or more <text/> that
can match it. If there is only one, then again the above property is
preserved, so it's OK.


So the problem is when we have more than one <text/> that can match it.

To see why it's OK, it helps to look at a problematic interleave:

<interleave>
  <group>
    <element name="X"/>
    <element name="Y"/>
  </group>
  <group>
    <element name="X"/>
    <element name="Z"/>
  </group>
</interleave>

Why is this interleave a problem? Because when we see X, there are two
ways to consume it, and the residual differs depending on which one we
choose.
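
To make the ambiguity concrete, here is a minimal sketch in Python. It is
not how Jing or any real validator works: each <group> is modelled as a
tuple of element names that must appear in order, the <interleave> as a
tuple of such groups, and the function name `residuals` is hypothetical.

```python
def residuals(branches, token):
    """Return every residual obtainable by letting one branch consume token.

    Simplifying assumption: a branch is a tuple of element names in order
    (a stand-in for <group>), and `branches` stands for <interleave>.
    """
    found = set()
    for i, branch in enumerate(branches):
        if branch and branch[0] == token:
            # This branch consumes the token; the others are left untouched.
            found.add(branches[:i] + (branch[1:],) + branches[i + 1:])
    return found


problematic = (("X", "Y"), ("X", "Z"))
for residual in residuals(problematic, "X"):
    print(residual)
```

Two different residuals come back for the single token X, one expecting Y
and one expecting Z next in the branch that consumed it, which is exactly
the non-determinism described above.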

Now, <text/> has the unique property that it is not "consumed" by
matching. That is, while residual(<element>,ElementToken) = <empty/>,
residual(<text/>,CharacterToken) = <text/>.

So even though there are two ways to consume the character token, the
residual is the same regardless of which <text/> matched it. So we can
always match the character token against every matchable <text/>, and
this doesn't change the semantics.
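
The argument can be checked on a toy model. The following Python sketch is
only an illustration under simplifying assumptions: a <group> is a tuple of
items in order, an <interleave> is a tuple of groups, and the names `TEXT`,
`CHAR`, `step` and `residuals` are hypothetical, not taken from any RELAX NG
implementation.

```python
TEXT = "#text"  # hypothetical marker standing for <text/>
CHAR = "#char"  # hypothetical marker standing for a character token


def step(branch, token):
    """Residual of one branch after token, or None if it cannot consume it."""
    if not branch:
        return None
    head, rest = branch[0], branch[1:]
    if head == TEXT:
        if token == CHAR:
            return branch  # <text/> is not used up by matching
        return step(rest, token)  # <text/> may match nothing; try what follows
    return rest if head == token else None


def residuals(branches, token):
    """Set of residuals over every branch that could consume token."""
    found = set()
    for i, branch in enumerate(branches):
        after = step(branch, token)
        if after is not None:
            found.add(branches[:i] + (after,) + branches[i + 1:])
    return found


two_texts = ((TEXT, "A"), (TEXT, "B"))
two_elements = (("X", "Y"), ("X", "Z"))
blocked = (("X", TEXT), ("Y", TEXT))

print(len(residuals(two_texts, CHAR)))    # both <text/> match, residuals coincide
print(len(residuals(two_elements, "X")))  # two matches, two distinct residuals
print(len(residuals(blocked, CHAR)))      # no branch can accept the token yet
```

In this model the two matching <text/> patterns give back the unchanged
interleave, so the set of residuals has a single member, while the
element case yields two.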

Q.E.D.

regards,
--
Kohsuke KAWAGUCHI                          +1 650 786 0721
Sun Microsystems                   kohsuke.kawaguchi@sun.com


