relax-ng message

Subject: key/keyRef: generalization, ad-hocness,and generalization of ad-hocness

From: Murata Makoto <mura034@attglobal.net>
To: relax-ng@lists.oasis-open.org
Date: Sat, 28 Jul 2001 00:26:29 +0900

Summary: I like truly-generalized mechanisms or simple ad-hoc
mechanisms.  The current specification generalizes ad-hoc mechanisms 
considerably.  I disagree.


James and I agree that a truly generalized solution for identity
constraints is based on path expressions.  Such a solution would cover
multi-part keys, scoped keys, and composite-element keys (e.g.,
<name>Murata<ruby>murata</ruby></name>).  Furthermore, identity
constraints are likely to be separated from grammars.
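
To make the idea concrete, here is a minimal sketch of a path-based key
check, written in Python rather than in any proposed RELAX NG syntax.  The
function name check_key, the split into a selector path and field paths,
and the sample document are all assumptions made for illustration.  It
shows a multi-part key and a composite-element key; scoped keys are not
modelled.

```python
import xml.etree.ElementTree as ET

def check_key(root, selector, fields):
    """Collect the key tuples selected under root; raise if one repeats.

    selector and fields are simple ElementTree paths (hypothetical design).
    """
    seen = set()
    for node in root.findall(selector):
        # A multi-part key is the tuple of values reached by each field path.
        # itertext() flattens mixed content, so <name>Murata<ruby>murata</ruby></name>
        # contributes its full text as one composite value.
        key = tuple("".join(node.find(f).itertext()) for f in fields)
        if key in seen:
            raise ValueError(f"duplicate key {key}")
        seen.add(key)
    return seen

doc = ET.fromstring(
    "<staff>"
    "<person><dept>x</dept><name>Murata<ruby>murata</ruby></name></person>"
    "<person><dept>y</dept><name>Murata<ruby>murata</ruby></name></person>"
    "</staff>"
)

# The two-part key (dept, name) is unique even though name alone is not.
keys = check_key(doc, "person", ["dept", "name"])
```

Because the constraint is stated as paths over the document, it can live
outside the grammar that describes the document's structure.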

However, James feels that such identity constraints will require more
effort and deserve an independent specification.  James proposes that
we introduce some small mechanism as part of RELAX NG.  It should be
simple and better than ID, IDREF, IDREFS of DTDs.  I totally agree.
The TC has also agreed that we need typed keys (e.g., each record is
identified by an integer).

However, the current specification goes far beyond ID, IDREF, IDREFS 
in a number of points.

  (1) paths of arbitrary lengths (e.g., paragraphs in
    chapters/sections/subsections are identified by numbers,
    but paragraphs in column articles are not).

  (2) key/keyRef declarations can be combined with <choice> (see the 
     production rule for the non-terminal "value")

  (3) key/keyRef declarations can be combined with <list> (again, see the 
     production rule for the non-terminal "value")

  (4) key/keyRef declarations within <list> can be freely combined
     with <oneOrMore>, <interleave>, <group>, and <choice> (see the
     production rule for the non-terminal "tokens")

I think that (1) is a back-handed mechanism for path expressions.  A
truly generalized approach is sketched by [1] and is presented in my
PODS 2001 paper.  I think that it is very ad-hoc to mimic path
expressions by tree grammars.  I propose to have an issue in our issue
list.  I am not sure if the TC has officially made any decision about this.

[1] http://lists.oasis-open.org/archives/relax-ng/200107/msg00089.html

I also think that (2) is a doubtful generalization.  I believe that
the TC has never discussed this, so this part of the current
spec is not the status quo.  I think that this should automatically become
an issue since no consensus has ever existed.

In my understanding, the semantic rule (keyAmbig) disallows almost all
<choice> elements containing <key> or <keyRef>.  It allows <choice>p1
p2</choice>, only if p1 and p2 have exactly the same key-types.  Do
people see any value in allowing

<choice>
  <key name="foo">
    <data type="short"><param  name="minInclusive">100</param></data>
  </key>
  <key name="foo">
    <value type="short">-1</value>
  </key>
</choice>

?  I do not think that this feature is worth the complexity it adds to
the spec.

Issue 56 includes (3).  I think that (3) is reasonable for <keyRef>
but unreasonable for <key>.  Don't be deceived by superficial
symmetry.  If an element or attribute contains a list of keyRefs
(i.e., IDREFS), we conceptually have multiple reference objects.  The
rest of the story is simple.  However, if an element or attribute
contains a list of keys, we destroy the one-to-one correspondence between
identifiers and elements/attributes and go far beyond ID/IDREF or
the identity constraints studied in the database community.
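
The asymmetry can be sketched in a few lines of Python.  This is only a
toy model, not the semantics of the spec: each element is assumed to be a
pair of a label and the whitespace-separated tokens of its key or keyRef
value, and the names index_keys, resolve_refs and is_one_to_one are
hypothetical.

```python
def index_keys(elements):
    """Map each key token to the element that declares it; reject duplicates."""
    owner = {}
    for label, tokens in elements:
        for token in tokens:
            if token in owner:
                raise ValueError(f"duplicate key {token!r}")
            owner[token] = label
    return owner

def resolve_refs(owner, tokens):
    """Resolve a list of keyRef tokens, IDREFS-style: one reference per token."""
    return [owner[token] for token in tokens]

def is_one_to_one(owner):
    """True when no element is identified by more than one key."""
    return len(set(owner.values())) == len(owner)

# One key per element, as with ID: identifiers and elements correspond 1:1.
single = index_keys([("rec-a", ["1"]), ("rec-b", ["2"])])

# A list of keys on one element: "1" and "3" both identify rec-a.
listed = index_keys([("rec-a", ["1", "3"]), ("rec-b", ["2"])])

# A list of keyRefs is simply several independent references.
targets = resolve_refs(single, "1 2".split())
```

In this model a list of keyRefs needs nothing new, while a list of keys
makes is_one_to_one(listed) false.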

I think that (4) is really bad.  I believe that the TC has never
discussed this either, so this part of the spec is not the status quo.
I think that this should automatically become an issue since no
consensus has ever existed.

Although the BNF in 7.1 appears to allow so many possibilities, the
semantic rule (keyAmbig) actually imposes a severe restriction.  For
example, we can have <list><group>p1 p2 </group></list>, only if p1 and
p2 have exactly the same key-types.

>I know you don't like 7.3, but do you really not understand it?  If you
>don't understand, please say which inference rules you find hard to
>understand.

I believe I understand 75% of it.  However, to really understand (4), I
have to understand the BNF in 7.1 as well as the semantic rules in 7.4.
At first, I thought that the semantic rules for "contain" could be
replaced with some prose, but I was wrong.  The interaction between
semantic rules is more complicated than I thought.

Cheers,

Makoto

Follow-Ups:
- Re: key/keyRef: generalization, ad-hocness,and generalization of ad-hocness
  - From: Kohsuke KAWAGUCHI <kohsuke.kawaguchi@eng.sun.com>
- Re: key/keyRef: generalization, ad-hocness,and generalization of ad-hocness
  - From: Kohsuke KAWAGUCHI <kohsuke.kawaguchi@eng.sun.com>