Subject: Updated issue list
Decisions made on the conference call are reflected. Issues raised by Murata-san are added. Norm, would you update the committee's webpage with this updated issue list, please? regards, -- Kohsuke KAWAGUCHI +1 650 786 0721 Sun Microsystems firstname.lastname@example.org
issues.xml
Title: RELAX NG Issues List
RELAX NG issue list
Voted unanimously to resolve this issue by allowing elements with no declared children to have whitespace.
"except" and "butNot" by jjc. TC is open to other suggestions.
The TC's general feeling was that there is no good reason to change it. On 5/31/2001, the TC decided to close this issue without any action until someone brings up new material.
On 2001/06/28, the TC decided that RELAX NG processors should honor xml:base.
The original post suggests introducing syntactic sugar to match frequently used wildcard patterns. Namely,
Some raised concerns about whether this was important enough to be worth a special syntactic abbreviation. No conclusion was reached.
On 5/31/2001, the TC decided to drop this feature for ver.1. The reasons mentioned were that (1) it's not a good practice, so it's good to keep it slightly cumbersome, (2) we don't lose any of the expressiveness of RELAX NG, (3) it's less frequently used, and (4) adding it later is easier than dropping it later.
jjc suggests form="prefixed|unprefixed" or form="qualified|unqualified" (the same as XML Schema).
Just as a possibility, jjc also mentioned splitting <attribute> into two different elements, just as we did for the parent attribute of the <ref> element.
On 5/31/2001, TC decided to close this issue with no-action-required.
Murata-san reported that RELAX Namespace will use <framework> for its root element. Therefore, RELAX NG will keep using the name <grammar>.
This issue is merged into the "datatype and identity constraint" issue.
Decided to introduce <oneOrMoreToken> and <zeroOrMoreToken> patterns to produce lists.
(a) Put a version in the RELAX NG namespace URI (by jjc)
(b) Use a version attribute on the root element (by jjc)
Input from Eric van der Vlist:
He suggests having both (a) and (b).
Input from John Cowan:
He suggests using an FPI (kk: kind of URN?).
In the 5/3 telecon, we decided to adopt proposal (a). See the details of this proposal.
Input from Josh Lubell:
He wants to have this because of his own experience.
Input from Kohsuke Kawaguchi:
He suggests that this feature can be provided outside of the core spec (as a preprocessor-like tool).
The TC is still not convinced that this feature is important enough to be added.
A tentative decision was made to drop this feature from version 1.
James Clark proposed changing the attribute name to a more suitable one, but no one has suggested an alternative.
John Cowan suggests adding optional "grammar" attribute to "ref" element and thereby introducing the ability to refer to any ancestor grammar.
On 5/31/2001, TC decided to use <parentRef name="..."/> instead of <ref name="..." parent="true"/>. So now we have <ref>,<externalRef>, and <parentRef>.
A comment from Jeni Tennison has re-opened this issue. Here is a quote from her post to relax-ng-comments:
I'd like to be able to restrict the contents of the xs:restriction element based on its base attribute, but only for certain values. So if its base attribute is "xs:decimal" then it could contain xs:totalDigits but if it's "xs:string" then it can't, and so on. However, if it's not a name in the 'xs' namespace, then I want to allow any content.

Best would be to exclude all QName values with a particular namespace, but XMLSchema-datatypes doesn't offer that as a facet (and therefore RELAX NG doesn't have it as a parameter). I don't want to use the 'pattern' parameter to test the names because that would undermine the namespace awareness of the schema.

I thought I could create a pattern that didn't match particular enumerated values, but I can't find a way to do so. What I'm looking for is something like:

  <choice>
    <group>
      <attribute name="foo">
        <value>bar</value>
      </attribute>
      <element name="bar"><empty /></element>
    </group>
    <group>
      <attribute name="foo">
        <!-- not current RELAX NG -->
        <not>
          <value>bar</value>
        </not>
        <!-- /not current RELAX NG -->
      </attribute>
      <element name="baz"><empty /></element>
    </group>
  </choice>

This use case would imply that the difference element would be useful outside name classes as well.
It is easy to implement validators if the use of <difference> p1 p2 </difference> is limited in such a way that both p1 and p2 match one token and one token only (e.g., <data>, <value>, choices of those things.)
So it looks like a good advancement in the functionality with a very small cost. Do we want to have <difference> in this restricted fashion?
Also, if we add <difference>, then we can (or should) also allow <not> P </not> as syntactic sugar for
<difference><data type="string"/> P </difference>
The TC voted not to adopt this functionality (the original proposal of allowing <difference> on any patterns) for ver.1.0, but the issue has now been re-opened.
Various people suggested various names (including, but not limited to, TRELAX, TryRELAX, RELAXED, RELEX, REFLEX, RELAX XML Schema, TREELAX, RELAX 2, EXLAX, etc.).
One concern is whether we should include "XML Schema" in the name.
Update (May 3rd): jjc suggests "RELAX something" for various reasons (see the minutes of the May 3rd telecon). In response, "RELAX NG" (next generation, I guess) and RELAX++ are proposed. Other postfixes are welcome.
Names suggested after the telecon include URELAX, iRELAX, and TRELAX. The editor feels that RELAX NG has established some degree of popularity.
We will use "RELAX NG" and its pronunciation will be "relaxing."
The current spec already has several restrictions that prevent problematic situations.
Some argue that the current restrictions still leave something to be desired.
TC decided to adopt the restriction proposed in the post of James Clark (2001/06/21).
Currently, RELAX NG allows patterns like
<attribute name="foo"><attribute name="..." /></attribute>
<attribute name="foo"><element name="..." /></attribute>
Should we explicitly prohibit them? (original posts)
Those malformed patterns cannot accept anything: any RELAX NG processor can safely replace them with <notAllowed /> without changing the semantics.
So at least it doesn't confuse processors.
Murata-san suggests that "implementations SHOULD issue a warning" for a pattern that meets the following condition: an <attribute> pattern that "directly or indirectly contain other <attribute> or <element> patterns" after the normalization.
In the conference call of 2001/06/14, we voted to make this situation an error that must be reported by processors. One of the reasons was the lack of good use cases that make use of <attribute>/<element> in <attribute>.
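The condition voted on above can be sketched as a simple tree walk. This is only an illustration, not the TC's algorithm: the function name find_bad_attributes is hypothetical, the check looks only at direct nesting in a single file, and it does not follow <ref> to catch the "indirectly contain" case, which would require resolving definitions first.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def find_bad_attributes(schema_text):
    """Return the name of every <attribute> with an <attribute>/<element> descendant.

    Assumption: attributes are named with a 'name' attribute; <ref> is not followed.
    """
    root = ET.fromstring(schema_text)
    bad = []
    for node in root.iter():
        if local(node.tag) != "attribute":
            continue
        # iter() yields the node itself first, so skip it.
        descendants = list(node.iter())[1:]
        if any(local(d.tag) in ("attribute", "element") for d in descendants):
            bad.append(node.get("name"))
    return bad


schema = (
    '<element name="root">'
    '<attribute name="foo"><attribute name="bar"/></attribute>'
    '<attribute name="ok"><empty/></attribute>'
    "</element>"
)
print(find_bad_attributes(schema))
```

A processor following the 2001/06/14 decision would report an error for each name returned rather than silently treating the pattern as <notAllowed />.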
A RELAX NG pattern is currently sensitive to the order of <define> elements or <include> elements because of the redefinition capability.
However, this sensitivity can be removed by restricting redefinition to occur only under the <include> element (like XML Schema). But this restriction also limits the expressiveness of RELAX NG.
Should we introduce this restriction to make RELAX NG an order-insensitive language? Is this worth the cost of limiting language expressiveness? (original posts)
One touchstone will be XHTML modularization. kk wrote that proposal #1 does not work with XHTML m12n and that #2 does.
Proposal #3, with its amendment, was adopted.
RELAX NG allows patterns like
<element name="joe"> <attribute name="foo"> ... </attribute> <attribute name="foo"> ... </attribute> </element>
Can we prohibit patterns like this? If so, how can we do that? (the original post)
The GNF normalization can detect such a condition. However, since it is a time-consuming operation, it may not be suitable to mandate the enforcement of this constraint.
Murata-san proposed that "implementations MAY issue warning messages" by using the GNF normalization.
jjc proposed that for every <group> p1 p2 </group>, "the set of possible attribute names occurring in p1 must be disjoint from those occurring in p2." (The present author believes that the same restriction is necessary for <interleave/>.)
Another proposal made by M-san introduces a new primitive <multipleAttributes> NC P </multipleAttributes> that has built-in "zero-or-more" semantics.
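As a rough illustration of jjc's disjointness proposal (extended to <interleave> as the present author suggests), the sketch below collects attribute names per operand and reports overlaps. It is a simplification under stated assumptions: only attributes with a literal name attribute are considered (no name classes), <ref> is not followed, and the children of an <element> are treated as an implicit group. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def attribute_names(pattern):
    """Attribute names that `pattern` can contribute to the enclosing element."""
    tag = local(pattern.tag)
    if tag == "attribute":
        name = pattern.get("name")
        return {name} if name is not None else set()
    if tag == "element":
        return set()  # attributes inside belong to the nested element
    names = set()
    for child in pattern:
        names |= attribute_names(child)
    return names


def overlapping_attribute_names(schema_text):
    """List attribute names that occur in more than one operand of a group."""
    root = ET.fromstring(schema_text)
    clashes = []
    for node in root.iter():
        if local(node.tag) not in ("group", "interleave", "element"):
            continue
        seen = set()
        for child in node:
            names = attribute_names(child)
            clashes.extend(sorted(seen & names))
            seen |= names
    return clashes


joe = (
    '<element name="joe">'
    '<attribute name="foo"><empty/></attribute>'
    '<attribute name="foo"><empty/></attribute>'
    "</element>"
)
print(overlapping_attribute_names(joe))
```

Unlike the GNF-based detection, this kind of check is purely syntactic and cheap, which is the main attraction of stating the constraint in terms of name sets.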
XML Schema Part 2 has the capability to
But our language is also capable of doing the above three.
So if we use XML Schema Part 2 as the only datatype vocabulary, we should consider dropping some of the redundant capabilities (that is, restricting choices of <data>s, for example). (the original post)
jjc suggests closing this with "no-action required" because he wants to keep a distance from XML Schema Part 2.
We decided not to use the syntax of W3C XML Schema Part 2 for defining datatypes. Therefore, the overlap no longer exists.
We currently allow <grammar> elements to be nested. That is, grammar can be used just like any other patterns.
Murata-san wants to prohibit this because it may interfere with future namespace-based modularization (as currently seen in RELAX and XML Schema). (the original post)
"Namespace-based modularization" means that one module is responsible for one namespace. In my personal opinion (and probably Murata-san's), this is vital for multi-lingual validation, where multiple schema languages cooperate to validate one document.
Murata-san said he is willing to retract this if someone can convince him that nested grammars cannot interfere with such modularization.
Murata-san retracted his objection on 2001/6/7. This issue was then resolved as no-action-required.
This issue arose from merging several issues.
The first objective was to introduce identity constraint functionality into our new language. We then found that this issue is related to how our language treats datatypes.
Those posts are about possible features, but how those requirements affect the design is generally unclear. The editor believes that one thing that emerged in the telecon is that we don't need any path expressions if we abandon multipart keys.
In the 5/3 telecon, we reached some degree of consensus about the above requirements (see the minutes).
After the telecon, jjc posted his two proposals.
It was then discovered that, without greater involvement with datatypes, we can't use anonymous (or user-defined) types in key/keyref. This discovery led to another proposal from jjc.
"datatypes #1". This post proposes how to declare new datatypes under the control of our language and how to declare key/keyref constraint.
What's important here is "under the control of our language". RELAX NG allows a datatype library (DTLIB) to use its own syntax to declare new types. But in this proposal, every DTLIB is required to use the syntax of this proposal (to make a type equivalence test possible).
Datatypes #2 ("the proposal of the day"). Roughly speaking, this is a simplified version of "datatypes #1", which "I(jjc) hope will be able to command consensus."
The difference from the previous proposal is that this one doesn't have the concept of "derivation". That means you can't add facets to your type once you have defined it.
Kohsuke KAWAGUCHI also proposes the simplest version.
"Back to the basic" proposal. This one tries to mimic DTD's ID/IDREF capability.
The above "datatypes #2" proposal is adopted. We use the following syntax to define enumeration:
<choice> <token type="xsd:integer"> 5 </token> <token type="xsd:integer"> 2 </token> </choice>
And the following syntax to define a datatype:
<data type="xsd:integer"> <param name="minInclusive"> 5 </param> <param name="maxExclusive"> 8 </param> </data>
For many other details, see the minutes of the conference call. (Not available at this moment.)
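To make the <data>/<param> syntax above concrete, here is a minimal sketch of how a validator might apply the two parameters to a candidate value. It is an illustration only: the function name matches_integer_data is hypothetical, only the four numeric bound parameters are handled, and Python's int() is used as an approximation of the xsd:integer lexical space (it is slightly more lenient).

```python
import operator
import xml.etree.ElementTree as ET

# Comparison applied as check(value, bound) for each supported parameter.
CHECKS = {
    "minInclusive": operator.ge,
    "minExclusive": operator.gt,
    "maxInclusive": operator.le,
    "maxExclusive": operator.lt,
}


def matches_integer_data(data_xml, text):
    """Return True if `text` is an integer satisfying every <param> of the <data>."""
    data = ET.fromstring(data_xml)
    try:
        value = int(text.strip())
    except ValueError:
        return False
    for param in data.findall("param"):
        name = param.get("name")
        if name not in CHECKS:
            raise ValueError("parameter not handled by this sketch: %r" % name)
        if not CHECKS[name](value, int(param.text.strip())):
            return False
    return True


DATA = (
    '<data type="xsd:integer">'
    '<param name="minInclusive"> 5 </param>'
    '<param name="maxExclusive"> 8 </param>'
    "</data>"
)
for candidate in ("4", "5", "7", "8"):
    print(candidate, matches_integer_data(DATA, candidate))
```

With the example from the text, the accepted values are 5, 6 and 7.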
We need a namespace URI for the new language.
jjc suggests "http://relaxng.org/ns/m.n" where m.n is the version number.
M-san suggests "http://relaxng.org/ns/something/m.n" so that we can accommodate related namespace URIs. For "something", jjc suggests "structure".
In the 5/31/2001 conference call, several people spoke in favor of an HTTP-based URI.
Also, "core" is proposed along with "main" and "structure".
http://relaxng.org/ns/structure/0.9 was chosen.
TREX allows the following pattern.
<oneOrMore> <element> ... </element> <attribute> ... </attribute> </oneOrMore>
In computer science terminology, this is beyond the power of a "regular language" and is therefore problematic for applications.
Shall we avoid this excessive expressiveness? If so, how?
In the above post, M-san suggests restricting <oneOrMore> and <zeroOrMore> to either
jjc proposes the following restriction: "If a <oneOrMore> element has an <attribute> descendant, it must not have a <group> or <interleave> descendant."
KK suggests the following restriction: "If <group> or <interleave> is used under <oneOrMore>, then it cannot contain any <attribute>."
The TC decided to adopt the restriction by KK.
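The adopted restriction can be checked syntactically. The sketch below is one literal reading of KK's wording, with assumptions that should be stated plainly: a <oneOrMore> with more than one child is treated as containing an implicit <group> (which is what makes the TREX example above a violation), <ref> is not followed, <zeroOrMore> is not covered, and the search does not stop at nested <element> boundaries. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def violates_restriction(schema_text):
    """True if a <group>/<interleave> under a <oneOrMore> contains an <attribute>."""
    root = ET.fromstring(schema_text)
    for repeat in root.iter():
        if local(repeat.tag) != "oneOrMore":
            continue
        groups = [n for n in repeat.iter()
                  if local(n.tag) in ("group", "interleave")]
        if len(repeat) > 1:
            # Assumption: several children form an implicit group.
            groups.append(repeat)
        for group in groups:
            if any(local(d.tag) == "attribute" for d in group.iter()):
                return True
    return False


trex_example = (
    "<oneOrMore>"
    '<element name="a"><empty/></element>'
    '<attribute name="b"/>'
    "</oneOrMore>"
)
print(violates_restriction(trex_example))
```

A lone <attribute> under <oneOrMore>, with no grouping involved, passes this check.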
Eric van der Vlist (email@example.com)
Status: resolved
RELAX NG uses QName to
But for some, use of QNames in this way is something they want to avoid.
Can we avoid using QNames? If so, shall we avoid using QNames? If so, how?
Eric proposed to declare the prefix-URI mappings in another independent way, as follows:
<namespace prefix="e" uri="http://www.w3.org/2001/XMLSchema-datatype"/> <data type="e:integer"/>
Here is the original post.
The editor believes Eric also had an alternative proposal, which writes out the URI every time, like this:
<data namespace="http://www.w3.org/2001/XMLSchema-datatype" type="integer"/>
jjc suggests introducing a new attribute "datatypeNamespace", which propagates like the "ns" attribute. This proposal will address the problem of using QNames for datatypes.
Murata-san opposes the use of QNames and proposes the following alternative solutions.
Another proposal made by jjc utilizes <div> elements to declare namespaces.
Several more proposals were also made. See the thread starting from here for details.
Some people (including the present editor, for full disclosure) don't like to use QNames in values for several reasons, including:
On the other hand, QNames are easier for humans to write and less verbose, and some people think the use of QNames is unavoidable.
Also, the TC seems to have a consensus that it should be possible to write a RELAX NG grammar without using XML Namespaces, if the author so prefers (because for many people XML Namespaces are still a new technology).
On 2001/06/07, the TC voted to retain the current status; that is, use @ns to specify the default namespace and allow element/attribute names to be QNames.
<oneOrMoreToken> and <zeroOrMoreToken> were adopted to make lists of strings possible. But it was discovered that a new pattern, namely <list>, can reproduce the semantics of the <***OrMoreToken> patterns and simplify both implementations and the spec.
The above original post contains the semantics of the <list> pattern.
<***OrMoreToken> patterns do not allow us to have something like "a list of 4 integers", which is possible under W3C XML Schema. This proposal makes the list capability of RELAX NG more expressive than W3C XML Schema.
On 5/31/2001, the TC decided to adopt this proposal and remove <oneOrMoreToken> and <zeroOrMoreToken>.
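The "list of 4 integers" case can be illustrated with a small sketch. The actual semantics of <list> are in the original post, which is not reproduced here; this sketch assumes that a list splits the string on whitespace and matches the resulting tokens, one by one, against a fixed sequence of token patterns. The helper names are hypothetical, and int() only approximates an integer datatype.

```python
def is_integer(token):
    """Approximate check that a token is an integer."""
    try:
        int(token)
    except ValueError:
        return False
    return True


def matches_list(text, token_checks):
    """True if the whitespace-separated tokens of `text` match `token_checks` in order."""
    tokens = text.split()
    if len(tokens) != len(token_checks):
        return False
    return all(check(token) for check, token in zip(token_checks, tokens))


four_integers = [is_integer] * 4
print(matches_list("1 2 3 4", four_integers))
print(matches_list("1 2 3", four_integers))
```

A fixed-length sequence like this is exactly what the <***OrMoreToken> patterns could not express, since they only describe repetition of a single token type.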
Currently, the symbol spaces of key/keyref are global. So two independently-authored grammars may accidentally use the same key name. Is there any solution to this?
Sometimes, an author wants to refer to keys that another author wrote. That makes restriction difficult.
TC is now waiting for the public comments.
TC has decided to adopt the proposal from James Clark, which extends symbol space name to QName.
Should it be OK to have a redefinition (<define> inside <include>) when that redefined pattern is not defined in the included file?
Consider the following example:
A.rng:
  <grammar>
    <include href="B.rng"/>
    <include href="C.rng">
      <define name="foo"> ... </define>
    </include>
  </grammar>

B.rng:
  <grammar>
    <define name="foo"> ... </define>
  </grammar>

C.rng:
  <grammar/>
(Quoting from jjc's post) "If the user has done this, then they have probably made a mistake. On the other hand the semantics are clear. We can either make this an error or suggest that implementations give a warning."
The TC has decided that such a redefinition must be rejected. An algorithm to detect this situation is available in the thread starting from here.
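The algorithm from the thread is not reproduced in this list. As a rough sketch of the idea only, the check below compares the names redefined inside each <include> with the <define> names found at the top level of the included file. It is an assumption-laden simplification: files are supplied as an in-memory dictionary, and nested includes or <div> wrappers in the included file are not followed. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def bad_redefinitions(main_xml, files):
    """Return (href, name) pairs for redefinitions with nothing to redefine.

    `files` maps each href to the text of the included grammar.
    """
    root = ET.fromstring(main_xml)
    problems = []
    for include in root.iter():
        if local(include.tag) != "include":
            continue
        href = include.get("href")
        target = ET.fromstring(files[href])
        defined = {d.get("name") for d in target if local(d.tag) == "define"}
        for child in include:
            if local(child.tag) == "define" and child.get("name") not in defined:
                problems.append((href, child.get("name")))
    return problems


A = (
    "<grammar>"
    '<include href="B.rng"/>'
    '<include href="C.rng"><define name="foo"><empty/></define></include>'
    "</grammar>"
)
FILES = {
    "B.rng": '<grammar><define name="foo"><empty/></define></grammar>',
    "C.rng": "<grammar/>",
}
print(bad_redefinitions(A, FILES))
```

For the A.rng/B.rng/C.rng example, the redefinition of foo is reported against C.rng, since foo is defined only in B.rng.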
<grammar> <define name="foo"> ... </define> <define name="foo" combine="choice"> ... </define> </grammar>
Shall RELAX NG keep the same restriction, or not?
Quoting from jjc's post: " I found myself confused by this when reading RELAX grammars. I would find a reference to a label foo, then look for an elementRule foo; when I found it, I would assume it's the only definition. My assumption would be incorrect, and I would therefore misunderstand the schema (though eventually of course I would notice the other definitions and understand correctly). The other side of the argument is that this is an extra complication, and makes things slightly harder to explain. "
The TC decided to allow this (in the 2001/6/7 conference call).
It looks problematic to allow <element>s/<attribute>s within a <list> (maybe it's not). We may want to restrict patterns that are allowed inside <list>.
KK suggested prohibiting attributes inside a list.
TC seems to have a consensus that these should be prohibited.
It is discovered that it is possible to modify the implementation to correctly process elements/attributes inside a list.
Jeni Tennison <firstname.lastname@example.org>
Status: resolved
Currently, the <not> pattern can have only one operand, and <not>p</not> is considered syntactic sugar for <difference><anyName/>p</difference>. Ms. Tennison suggests that we change <not> to accept multiple operands by defining
<not>p1 p2 ...</not>
as equivalent to
<difference> <anyName/> <choice> p1 p2 ... </choice> </difference>
This change is relatively easy because <not> is just syntactic sugar and it doesn't affect any other part of the spec. And as Ms. Tennison said, this "would be more convenient".
The TC decided to close this issue with no-action-required (2001/06/21). The reason was that <not> is inherently a unary operator, and the semantics might become obscure if we allow multiple operands.
The TC will revisit this issue when someone suggests changing the name of <not>.
Jeni Tennison <email@example.com>
Status: resolved
Currently, our (conceptual) interface to datatype libraries is defined in such a way that the following pattern matches the following instance.
pattern:
  <?xml version="1.0"?>
  <element name="root"
      xmlns="http://relaxng.org/ns/structure/0.9"
      ns="http://www.example.org"
      datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
    <!-- note that the value is unqualified -->
    <value type="QName">foo</value>
  </element>

instance:
  <root xmlns="http://www.example.org"
      xmlns:rng="http://relaxng.org/ns/structure/0.9">
    rng:foo
  </root>
This is because the unqualified QName value is considered to have the namespace URI of the default namespace. In this case, that is the namespace URI of RELAX NG, and this behavior is probably not what people want.
If we modify the spec to resolve an unqualified name to the value of the "ns" attribute, instead of the default namespace URI of the pattern file, then the above pattern would match the following instance:
instance: <root xmlns="http://www.example.org" xmlns:rng="http://relaxng.org/ns/structure/0.9"> foo </root>
The other resolution would be simply to close this issue with no action required.
On 2001/06/21, the TC decided that the default namespace should come from the ns attribute, rather than the xmlns default namespace.
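The effect of this decision can be shown with a small sketch of QName resolution. It is not taken from the spec: resolve_qname is a hypothetical helper that maps a lexical QName to a (namespace URI, local name) pair, given the in-scope prefix bindings and whichever namespace is used for unprefixed names.

```python
RNG = "http://relaxng.org/ns/structure/0.9"
EXAMPLE = "http://www.example.org"


def resolve_qname(text, prefixes, default_ns):
    """Resolve a lexical QName to a (namespace URI, local name) pair."""
    name = text.strip()
    if ":" in name:
        prefix, local_part = name.split(":", 1)
        return (prefixes[prefix], local_part)
    return (default_ns, name)


# Pattern side: <value type="QName">foo</value>
old_pattern = resolve_qname("foo", {}, RNG)      # default taken from xmlns
new_pattern = resolve_qname("foo", {}, EXAMPLE)  # default taken from the ns attribute

# Instance side: default namespace is EXAMPLE, prefix rng is bound to RNG.
instance_prefixes = {"rng": RNG}
print(resolve_qname(" rng:foo ", instance_prefixes, EXAMPLE) == old_pattern)
print(resolve_qname(" foo ", instance_prefixes, EXAMPLE) == new_pattern)
```

Under the old rule the pattern value only equals the instance text rng:foo; under the adopted rule it equals the plain foo, which is what a schema author would expect.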
Allowing <list> doesn't buy anything. Should we prohibit that?
In the 2001/06/21 conference call, we decided to prohibit them as a compile-time error.
Here is the original post.
Although there are several algorithms for manipulating name classes, the name class syntax, especially <difference>, makes it look complicated. Even if it only looks complicated, that still discourages implementors.
However, it is possible to change the primitives to make it look simple, while still preserving the same expressiveness.
Here is the proposed new syntax.
name-class ::=
    name-class-literal
  | <t:anyName/>
  | <t:anyNsName ns="namespaceURI"/>
  | <t:anyNsNameExcept> name-class-literal+ </t:anyNsNameExcept>
  | <t:anyNameExcept> (name-class-literal | <t:anyNsName ns="namespaceURI"/>)+ </t:anyNameExcept>
  | <t:choice> name-class name-class </t:choice>

name-class-literal ::=
    <t:name ns="namespaceURI"> NCName </t:name>
Here is the original post.
Generally speaking, <interleave> is hard to implement. It CAN be implemented, but it is difficult. Specifically, <interleave> makes it difficult to (1) perform validation with a statically compiled automaton, (2) perform subtyping algorithms, etc.
So some degree of restriction is desirable (at least for several people). If so, what restriction shall we employ?
The original post contains a proposal, but it does not yet enjoy the consensus of the TC.
Here is the original post.
If we adopt the pattern facet to <text/> and <mixed/>, then we will be relying on W3C XML Schema Part 2.
James suggested that the pattern facet is not the right thing for this purpose.
TC is looking for real world use cases for this feature.
A public comment from Franck Delahaye suggests that the current syntax of specifying key/keyref by attributes sometimes becomes clumsy. See his post for the actual example.
James suggested introducing dedicated elements <key> and <keyref> for this purpose.