Subject: Datatype and identity constraints proposal of the day (17 May)
Here's an attempt at a simple proposal for dealing with datatypes and identity constraints in RELAX NG, which I hope will be able to command consensus.

This doesn't allow a user to define a datatype in a grammar and then derive a datatype from this by adding parameters. Although this may be inconvenient in a few cases, it doesn't reduce the expressive power of RELAX NG. Having experimented with designs to support this, I think the small gain in convenience is vastly outweighed by the increase in complexity. If an implementation/user community really needs to support this, they can do so by using QNames referring to types defined in external XML Schema (or even inline XML Schema) or by using some other external mechanism: RELAX NG doesn't have to restrict the QNames of types to be in the XML Schema datatypes namespace (though that's what I expect most implementations will support).

External Datatyping Interface
-----------------------------

A _datatype library_ is identified by a namespace URI. A datatype library maps a local name (NCName) to a _datatype_.

A datatype provides two operations.

a) An allows operation. This returns true or false. It takes the following arguments:

   1. the string to be checked
   2. the context for the string to be checked
   3. a parameter set

   A context is a set of in-scope namespaces (plus entity declarations?). A parameter set is a mapping from NCNames to strings.

b) An equivalent operation. This returns true or false. It takes the following arguments:

   1. the first string to be compared. The allows operation must return true for this string with an empty parameter set.
   2. the context for the first string
   3. the second string to be compared. The allows operation must return true for this string with an empty parameter set.
   4. the context for the second string

Changes relative to TREX
------------------------

The changes relative to TREX would be as follows:

1. Anonymous datatypes are removed.

2. A <data> element may have zero or more <param> elements. A <param> element has a name attribute which is an NCName. The content is a string specifying the value of the parameter. A <data> element must not have two <param> elements with the same name. The XML Schema Part 2 whiteSpace facet would not be acceptable as a parameter (see http://lists.oasis-open.org/archives/trex/200105/msg00146.html).

3. A <data> element may optionally have either a key or a keyRef attribute. The value is an NCName. It is an error to have two <data> elements X, Y such that X has a key or keyRef attribute equal to K, Y has a key or keyRef attribute also equal to K, but the type attributes of X and Y do not refer to the same namespace URI and same local name. The use of key/keyRef attributes would be subject to the unambiguity constraint described in

   http://lists.oasis-open.org/archives/trex/200105/msg00058.html
   http://lists.oasis-open.org/archives/trex/200105/msg00069.html
   http://lists.oasis-open.org/archives/trex/200105/msg00071.html

4. Two elements <oneOrMoreTokens> and <zeroOrMoreTokens> are added (see http://lists.oasis-open.org/archives/trex/200105/msg00145.html).

5. The <string> element is removed. There are several alternatives for replacing it:

   a) Add an <enumeration> element. This has a type attribute identifying a datatype and a sequence of one or more <value> elements identifying the allowed values. The equivalent operation associated with the named datatype is used for comparing strings against each value.

      <enumeration type="xsd:token">
        <value>foo</value>
        <value>bar</value>
      </enumeration>

      (Note that since parameters don't affect the equivalence relation, there is no need for <enumeration> to allow <param> children.)

   b) Add a <value> element. This has a type attribute identifying a datatype. The content of the element specifies the value. A forest matches a <value> pattern if it is a string which is equivalent to the specified value. To specify a choice of constants (i.e. an enumeration) you would use <choice>.

      <choice>
        <value type="xsd:token">foo</value>
        <value type="xsd:token">bar</value>
      </choice>

   c) Put a <choice> of <value>s inside a <data> element.

      <data type="xsd:token">
        <choice>
          <value>foo</value>
          <value>bar</value>
        </choice>
      </data>

   d) Put <value> elements inside <data> directly:

      <data type="xsd:token">
        <value>foo</value>
        <value>bar</value>
      </data>

6. Add two builtin datatypes to TREX: string and token. These are to allow enumerations and identity constraints to be usable without any external system of datatypes.

7. Rename <anyString/> to <text/>.

Issues
------

1. Which solution should we adopt for enumerations?

2. Should it be possible for parameter values to be context dependent? If so, a parameter would be modelled as a <NCName, string, context> triple instead of a <NCName, string> pair.

3. Should the value of a <param> be specified in the content or with a value attribute?

4. If we go for <enumeration>, should it allow a key/keyRef attribute?

5. Should the name of a key or keyRef be scoped/qualified in some way?

James
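
As an illustration of the External Datatyping Interface described above, here is a minimal Python sketch of how a datatype library might look. It is not part of the proposal. The class and function names, the library URI http://example.org/datatypes, the use of maxLength as the single supported parameter, the choice to reject unknown parameters with an exception, and the simplified QName handling are all assumptions made for the example. The proposal leaves entity declarations in the context as an open question, so the context here carries only namespaces.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Context:
    """Context of a string: in-scope namespaces (prefix -> namespace URI)."""
    namespaces: Mapping[str, str] = field(default_factory=dict)


class TokenDatatype:
    """Whitespace-normalized string with one illustrative parameter, maxLength."""

    @staticmethod
    def _normalize(s: str) -> str:
        return " ".join(s.split())

    def allows(self, s: str, context: Context, params: Mapping[str, str]) -> bool:
        unknown = set(params) - {"maxLength"}
        if unknown:  # assumption: unknown parameters are an error
            raise ValueError(f"unsupported parameters: {sorted(unknown)}")
        if "maxLength" in params:
            return len(self._normalize(s)) <= int(params["maxLength"])
        return True

    def equivalent(self, s1: str, c1: Context, s2: str, c2: Context) -> bool:
        # Precondition: both strings are allowed with an empty parameter set.
        assert self.allows(s1, c1, {}) and self.allows(s2, c2, {})
        return self._normalize(s1) == self._normalize(s2)


class QNameDatatype:
    """Simplified QName (no NCName syntax check); shows why context is an argument."""

    @staticmethod
    def _expand(s: str, context: Context) -> Optional[Tuple[str, str]]:
        prefix, sep, local = s.strip().rpartition(":")
        if not local or ":" in prefix or (sep and not prefix):
            return None
        if not prefix:
            return (context.namespaces.get("", ""), local)
        if prefix not in context.namespaces:
            return None
        return (context.namespaces[prefix], local)

    def allows(self, s: str, context: Context, params: Mapping[str, str]) -> bool:
        if params:  # assumption: this datatype takes no parameters
            raise ValueError("QName takes no parameters in this sketch")
        return self._expand(s, context) is not None

    def equivalent(self, s1: str, c1: Context, s2: str, c2: Context) -> bool:
        assert self.allows(s1, c1, {}) and self.allows(s2, c2, {})
        return self._expand(s1, c1) == self._expand(s2, c2)


# A datatype library is identified by a namespace URI and maps local names
# to datatypes. The URI below is hypothetical.
LIBRARIES: Dict[str, Dict[str, object]] = {
    "http://example.org/datatypes": {
        "token": TokenDatatype(),
        "QName": QNameDatatype(),
    },
}


def matches_enumeration(datatype, s: str, context: Context,
                        values: Sequence[str], value_context: Context) -> bool:
    """Alternative 5(a): s matches if it is allowed and equivalent to a listed value."""
    return datatype.allows(s, context, {}) and any(
        datatype.equivalent(s, context, v, value_context) for v in values
    )


if __name__ == "__main__":
    lib = LIBRARIES["http://example.org/datatypes"]
    ctx = Context()
    print(matches_enumeration(lib["token"], " bar ", ctx, ["foo", "bar"], ctx))
    c1, c2 = Context({"a": "urn:x"}), Context({"b": "urn:x"})
    print(lib["QName"].equivalent("a:foo", c1, "b:foo", c2))
```

Both calls in the usage section print True: the token comparison ignores surrounding whitespace, and the two QNames use different prefixes bound to the same namespace URI in their respective contexts. The second case is why equivalent takes a context for each string rather than comparing the strings alone.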