Subject: Re: Relationship among our data model, patterns, and XML 1.0

From: James Clark <jjc@jclark.com>
To: Murata Makoto <mura034@attglobal.net>, relax-ng@lists.oasis-open.org
Date: Tue, 12 Jun 2001 09:52:27 +0700

I would like to consider this in terms of the formal semantics.

What does it mean, in terms of the formal semantics, for a pattern p to
generate a tree t?  It means simply that the inference rules allow us to
prove a judgment of the form

     valid(t, p)

So the question is: do the inference rules currently allow us to prove

     valid(t, p)

for some t that is not allowed by XML 1.0? The answer is yes. There are 3
rules that cause the problem: (group), (interleave 1) and (oneOrMore 2).  In
all these cases, we do a1 + a2 when there is a possibility that a1 and a2
have attribute names in common.  Clearly we need to fix this.
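
To make the problem concrete, here is a minimal sketch. It assumes a toy
model that is not part of the formal semantics text: an attribute collection
is a Python list of (name, value) pairs, and "+" is plain concatenation.
The helper names combine and duplicate_names are hypothetical.

```python
from collections import Counter

# Assumed toy model: an attribute collection is a list of (name, value)
# pairs, and "+" on collections is plain concatenation.
def combine(a1, a2):
    return a1 + a2

def duplicate_names(attrs):
    """Return the attribute names that occur more than once, sorted."""
    counts = Counter(name for name, _ in attrs)
    return sorted(name for name, n in counts.items() if n > 1)

a1 = [("id", "x1")]
a2 = [("id", "x2"), ("class", "c")]

# Nothing in an unguarded a1 + a2 stops "id" from appearing twice,
# which XML 1.0 does not allow on a single element.
print(duplicate_names(combine(a1, a2)))  # ['id']
```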

However, I don't believe there is any need to introduce new syntax.
Murata-san suggested introducing a <multipleAttributes> pattern which
generates a collection of non-colliding attributes. Now we already have a
restriction on using <oneOrMore> and <attribute> such that we can transform
any pattern into a form where whenever <attribute> occurs inside <oneOrMore>
it is the only child of that <oneOrMore>.  Changing the syntax from

  <oneOrMore>
    <attribute>...</>
  </oneOrMore>

to

  <multipleAttributes>...</>

does not buy us anything.  Murata-san is right that we need a pattern that
generates a collection of non-colliding attributes, but there is no reason
why we cannot make oneOrMore have this semantics.  Currently the (oneOrMore
2) rule is:

E; ns |- a1; m1 =~ p => k1; kr1
E; ns |- a2; m2 =~ <oneOrMore> p </oneOrMore> => k2; kr2
--------------------------------------------------------
E; ns |- a1 + a2; m1, m2 =~ <oneOrMore> p </oneOrMore>
         => k1 + k2; kr1 + kr2

All we need to do is to add another antecedent:

E; ns |- a1; m1 =~ p => k1; kr1
E; ns |- a2; m2 =~ <oneOrMore> p </oneOrMore> => k2; kr2
disjoint(a1, a2)
--------------------------------------------------------
E; ns |- a1 + a2; m1, m2 =~ <oneOrMore> p </oneOrMore>
         => k1 + k2; kr1 + kr2


where disjoint(a1, a2) asserts that a1 and a2 do not have any attribute names
in common.
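
As a sketch only, disjoint could be read as follows, again assuming the toy
representation of an attribute collection as a list of (name, value) pairs
(an assumption of this example, not something the rules prescribe):

```python
def disjoint(a1, a2):
    """True when a1 and a2 share no attribute name (values are ignored)."""
    names1 = {name for name, _ in a1}
    names2 = {name for name, _ in a2}
    return names1.isdisjoint(names2)

print(disjoint([("id", "x1")], [("class", "c")]))  # True
print(disjoint([("id", "x1")], [("id", "x2")]))    # False
```

Only names are compared, so two attributes with the same name collide even
when their values differ.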

Now I can answer the question I asked in response to Murata-san's first
proposal: how can it make sense to allow

  <attribute><anyName/></attribute>

or

  <oneOrMore>
    <attribute><anyName/></attribute>
  </oneOrMore>

or

  <zeroOrMore>
    <attribute><anyName/></attribute>
  </zeroOrMore>

but not

  <group>
    <attribute><anyName/></attribute>
    <attribute><anyName/></attribute>
  </group>

?  With the restriction that Murata-san proposed, the (interleave 1) and
(group) rules are no longer a problem.  It would no longer be possible to
use these rules to prove the validity of elements with duplicate
attributes.  However, we would still need the additional antecedent on
(oneOrMore 2).  Thus the semantics of

<oneOrMore>p</oneOrMore>

are NOT equivalent to

<choice>
  p
  <group> p p </group>
  <group> p p p </group>
  ...
</choice>

because the former has the non-collision semantics whereas the latter does
not.  Thus it makes sense to disallow

<group>p p</group>

even though <oneOrMore>p</oneOrMore> is allowed.
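
The difference can be sketched with a toy matcher.  Everything here is an
assumption made for illustration: attribute collections are lists of
(name, value) pairs, the pattern p is a Python predicate standing in for
<attribute><anyName/></attribute>, and splits are taken only at list
prefixes, which is sufficient for a p that matches exactly one attribute
but not for patterns in general.

```python
def any_name_attribute(attrs):
    """Hypothetical stand-in for <attribute><anyName/></attribute>."""
    return len(attrs) == 1

def disjoint(a1, a2):
    """True when a1 and a2 share no attribute name."""
    return {n for n, _ in a1}.isdisjoint({n for n, _ in a2})

def one_or_more(attrs, p):
    """oneOrMore with the extra disjoint(a1, a2) antecedent."""
    if p(attrs):  # a single occurrence of p
        return True
    for k in range(1, len(attrs)):
        a1, a2 = attrs[:k], attrs[k:]
        if p(a1) and disjoint(a1, a2) and one_or_more(a2, p):
            return True
    return False

def group_pp_unchecked(attrs, p):
    """<group> p p </group> as a bare a1 + a2 split with no disjointness check."""
    return any(p(attrs[:k]) and p(attrs[k:]) for k in range(len(attrs) + 1))

dup = [("id", "1"), ("id", "2")]
ok = [("id", "1"), ("class", "c")]

print(one_or_more(dup, any_name_attribute))         # False
print(group_pp_unchecked(dup, any_name_attribute))  # True
print(one_or_more(ok, any_name_attribute))          # True
```

The colliding collection is rejected by one_or_more but would be accepted
by the unchecked group, which is why the two forms cannot be treated as
interchangeable.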

The conclusion is that I am happy with Murata-san's first proposal.

James

Follow-Ups:
- Re: Relationship among our data model, patterns, and XML 1.0
  - From: Kohsuke KAWAGUCHI <kohsuke.kawaguchi@eng.sun.com>

References:
- Relationship among our data model, patterns, and XML 1.0
  - From: Murata Makoto <mura034@attglobal.net>