Subject: Issue 57
I have been trying to solve issue 57.

Problem
-------

The spec says whitespace-only strings in an element are stripped before
it is validated against the content pattern. This makes the following
document

  <foo> </foo>

invalid with respect to this schema

  <element name="foo">
    <data type="string" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
      <param name="minLength">2</param>
    </data>
  </element>

Here is another problem. According to the inference rules in the spec,
the pattern

  <element name="foo" xmlns="http://relaxng.org/ns/structure/0.9">
    <list>
      <data type="token"/>
      <data type="token"/>
    </list>
  </element>

matches

  <foo>x</foo>

which I think users will find surprising. This is because of the
(empty string) rule which says that if a pattern matches the empty
string, then it matches the empty sequence.

Spec changes
------------

6.2 (or maybe new 6.2.1)

Introduce variable range ws, which is either an empty sequence or a
string consisting entirely of whitespace.

Introduce variant of =~, call it =~c (for complete match), with the
following inference rules:

   cx |- a, m =~ p
  ------------------ (complete match 1)
   cx |- a, m =~c p

   cx |- a, () =~ p
  --------------------- (complete match 2)
   cx |- a, ws =~c p

   cx |- a, "" =~ p
  ------------------ (complete match 3)
   cx |- a, () =~c p

So, for example, we have:

  " " =~c <empty/>
  () =~c <data type="string"/>
  not(" " =~c <value type="string"/>)

Note that (complete match 3) replaces the old (empty string) rule.
(complete match 2) says that the content of an element or attribute
that consists of only whitespace matches <empty/>.

6.2.7

Change the (attribute) rule to:

   cx |- {}; s =~c p    n in nc
  -----------------------------------------------------
   cx |- attribute(n, s) =~ <attribute>nc p</attribute>

No need for v variable range or toString function.

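As a rough illustration of the three (complete match) rules in 6.2 above, here is a minimal Python sketch. It is not taken from the spec or from any validator. The class names Empty, Data and Value, the min_length field, and the functions matches, is_ws and complete_match are all hypothetical. It assumes content is either the empty sequence (written as the Python tuple ()) or a single string, it models only the "string" datatype, and it treats the empty string as a member of the ws range.

```python
from dataclasses import dataclass
from typing import Union

XML_WS = " \t\r\n"  # the four XML whitespace characters


@dataclass(frozen=True)
class Empty:
    """Stands in for <empty/>."""


@dataclass(frozen=True)
class Data:
    """Stands in for <data type="string"/>; min_length is a hypothetical
    stand-in for the minLength param."""
    min_length: int = 0


@dataclass(frozen=True)
class Value:
    """Stands in for <value type="string">text</value>."""
    text: str = ""


Pattern = Union[Empty, Data, Value]
Content = Union[tuple, str]  # () is the empty sequence, a str is a string


def matches(m: Content, p: Pattern) -> bool:
    """Plain =~ with no (empty string) rule: data and value never match ()."""
    if isinstance(p, Empty):
        return m == ()
    if isinstance(p, Data):
        return isinstance(m, str) and len(m) >= p.min_length
    if isinstance(p, Value):
        return isinstance(m, str) and m == p.text
    raise TypeError(f"unsupported pattern: {p!r}")


def is_ws(m: Content) -> bool:
    """The ws range: the empty sequence or an all-whitespace string.
    Assumption: the empty string counts as all-whitespace."""
    return m == () or (isinstance(m, str) and m.strip(XML_WS) == "")


def complete_match(m: Content, p: Pattern) -> bool:
    """=~c, tried rule by rule."""
    if matches(m, p):                  # (complete match 1)
        return True
    if is_ws(m) and matches((), p):    # (complete match 2)
        return True
    if m == () and matches("", p):     # (complete match 3)
        return True
    return False
```

Under these assumptions, complete_match(" ", Empty()) and complete_match((), Data()) hold while complete_match(" ", Value()) does not, which mirrors the three examples given with the rules.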
6.2.8

Change rule (element) to the following:

   cx1 |- a; m =~c p    n in nc    okAsChildren(m)
   deref(ln) = <element> nc p </element>
  ----------------------------------------------------------------
   cx2 |- {}; ws1, element( n, cx1, a, m ), ws2 =~ <ref name="ln"/>

No need for stripSpace function.

6.2.9

Get rid of (empty string) rule.

Implementation
--------------

From an implementation perspective, this means that

- <value> and <data> are not "nullable", ie they don't match the empty
  sequence

- if an element or attribute has content that is nothing but
  whitespace (including no whitespace at all), then it matches an
  element or attribute pattern iff the name matches and the pattern for
  the content matches either the empty sequence or a string consisting
  of that whitespace

- a whitespace text node that has a sibling element is ignored

Advantages
----------

I think this makes things work in an unsurprising way.

It's more expressive than the current spec. For example, if you want
something that matches <foo></foo> or <foo/> but not <foo> </foo> (ie
more similar to EMPTY in XML 1.0), then you can use

  <element name="foo"><value type="string"/></element>

James