Subject: Illustration of
Fellow UBL TC members,

Please bear with me with this long example below, but during the call last week I detected some disbelief at my assertion that W3C Schema does not allow the simultaneous validation of two or more independent UBL extensions in a single instance with a single pass of the W3C Schema validator. I would *love* to be told that I'm wrong, but I cannot see where, and this impacts the "simplicity" of things we are trying to create for our users.

I say "simplicity" because when an extension module is published, it should be usable as is, and read-only. If I want two extensions supported, I shouldn't have to create some über-extension-schema with knowledge of all (and even then I don't think W3C Schema can handle the job).

The issue has to do with wild cards. The extension point in all UBL schemas allows users to create their own extensions and augment the validation of their invoices (say) and have additional information. Different communities will have different extensions. Any given community will only know of its own extensions. On top of this, we are about to publish our first "standardized extension" and so users may wish to validate an instance that has two extensions, theirs and the committee's.

All extension schemas from committees or communities are read-only, but the "coordinating schema" is not read-only and can be changed to accommodate the importation of the read-only schemas. In all of this, no read-only extension should need to know about other read-only extensions.

I have no problems describing a W3C Schema that accommodates one extension, or another schema that accommodates a different extension, but because of ambiguity I cannot seem to write a W3C Schema 1.0 schema that accommodates two extensions where either is allowed as well as something that isn't in either.
The long example is below, with many cases I can cover with W3C Schema until I reach the very end, where I cannot accommodate what I actually need, given this as a starting point:

 - there is an extension in the "x" namespace, for which there is a schema
 - there is an extension in the "y" namespace, for which there is a schema
 - there is an extension in the "z" namespace, but no schema exists

The objective is to write the extension schemas for "x" and "y" with no knowledge of each other, but still accommodate the unknown "z" schema (from some other community). Of course the importing schema must know of them, but that is okay, as it is the coordinating schema.

The instances:

 - ken1.xml - an instance with an "x" extension
 - ken2.xml - an instance with a "y" extension
 - ken3.xml - an instance with a "z" extension

The schemas:

 - ken.rnc, x.rnc, y.rnc - a working example of what I need in RELAX-NG
 - x.xsd, y.xsd - extension schemas for "x" and "y"
 - ken0.xsd - a schema without validation of extensions
 - ken1.xsd - a schema that validates "x" or any extension
 - ken2.xsd - a schema that validates "y" or any extension
 - ken3.xsd - a schema that validates "x" or "y" but nothing else, so it rightly complains with ken3.xml
 - ken4.xsd - a schema that says what I want but doesn't work with any instances because the schema is rejected

Because ken1.xsd uses the wild card of "anything except X" and ken2.xsd uses the wild card of "anything except Y", there is an ambiguity when the two are combined: "X" is accepted both by ken1.xsd explicitly and by ken2.xsd implicitly through its wild card. Such an ambiguity isn't allowed.

I've asked colleagues I respect if I have missed a feature of W3C Schema 1.0 that provides for two wild cards, and so far the only response I've received is "use NVDL", which of course will work, but that isn't the pure W3C Schema solution we've committed to delivering to our users.
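As a rough illustration of that ambiguity, the following Python sketch lists which particles of a combined choice (the two element references plus the two "##other" wild cards, as in ken4.xsd) could accept a given element. It is a toy model and not a schema processor: the function names are invented for this illustration, and the only rule it encodes is that an XSD 1.0 "##other" wild card matches any namespace-qualified element outside the target namespace of the schema that declares it.

```python
# Toy model of the namespace wild cards in x.xsd and y.xsd.
# All names here are hypothetical; nothing below calls a real validator.
X_NS = "urn:X-x"
Y_NS = "urn:X-y"
Z_NS = "urn:X-z"


def other_wildcard(target_ns):
    """Model <xsd:any namespace="##other"/> declared in a schema for target_ns."""
    def matches(element_ns):
        # XSD 1.0 "##other": namespace-qualified and not the target namespace.
        return element_ns is not None and element_ns != target_ns
    return matches


not_x = other_wildcard(X_NS)  # the x:notX group
not_y = other_wildcard(Y_NS)  # the y:notY group


def competing_particles(element_ns, local_name):
    """Return every particle of the combined choice that could accept the element."""
    hits = []
    if (element_ns, local_name) == (Y_NS, "thing"):
        hits.append("element y:thing")
    if (element_ns, local_name) == (X_NS, "thing"):
        hits.append("element x:thing")
    if not_y(element_ns):
        hits.append("wildcard y:notY")
    if not_x(element_ns):
        hits.append("wildcard x:notX")
    return hits


if __name__ == "__main__":
    for ns in (X_NS, Y_NS, Z_NS):
        print(ns, competing_particles(ns, "thing"))
```

In this model every one of the three elements is claimed by two particles: x:thing by its own declaration and by y's wild card, y:thing by its own declaration and by x's wild card, and z:thing by both wild cards. A content model in which an element can be attributed to more than one particle is what W3C Schema 1.0 rejects as ambiguous.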
Thinking "there isn't one", I'm instructing UBL users that when they want to validate their XML instances with two possible known extensions and any number of unknown extensions, they have to prepare two sets of schemas, one validating one known extension and any other, and the other validating the other known extension and any other. Where RELAX-NG validates the instance with a single pass, the use of W3C Schema requires two schema passes with two different schemas. In the above list, "ken1.xsd" and "ken2.xsd".

Please let me know if you have any questions. If I can make the call tomorrow, it will only be for about 15 minutes, so if you can post them publicly or to me privately, I'll try to be prepared to address them quickly in the call or on this list.

Thanks!

. . . . . . . . . . . Ken

p.s. files attached are in ZIP format
p.p.s. if I do get feedback from my colleagues, I'll summarize to the list what they tell me

+ echo The instances to be validated:
The instances to be validated:
+ cat ken1.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
  <extension>
    <x:thing>abc</x:thing>
  </extension>
  <b>xyz</b>
</a>
+ cat ken2.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
  <extension>
    <y:thing>abc</y:thing>
  </extension>
  <b>xyz</b>
</a>
+ cat ken3.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
  <extension>
    <z:thing>abc</z:thing>
  </extension>
  <b>xyz</b>
</a>

+ echo RELAX-NG works:
RELAX-NG works:
+ cat ken.rnc
namespace x = "urn:X-x"
namespace y = "urn:X-y"
include "x.rnc"
include "y.rnc"
start = element a
{
  element extension
  {
    x-thing | y-thing | element * - ( x-notX | y-notY ) { text }
  }?,
  element b { text }
}
+ cat ken1.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
  <extension>
    <x:thing>abc</x:thing>
  </extension>
  <b>xyz</b>
</a>
+ rnc ken.rnc ken1.xml
+ cat ken2.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
  <extension>
    <y:thing>abc</y:thing>
  </extension>
  <b>xyz</b>
</a>
+ rnc ken.rnc ken2.xml
+ cat ken3.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
  <extension>
    <z:thing>abc</z:thing>
  </extension>
  <b>xyz</b>
</a>
+ rnc ken.rnc ken3.xml

+ echo W3C Schema with a wild card works, but 'doesn'\''t' validate:
W3C Schema with a wild card works, but doesn't validate:
+ cat ken0.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="extension">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:any processContents="skip"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="b">
          <xsd:complexType>
            <xsd:simpleContent>
              <xsd:extension base="xsd:string"/>
            </xsd:simpleContent>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
+ w3cschema ken0.xsd ken1.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.121) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken0.xsd ken2.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.120) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken0.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.132) with no errors and no warnings.
Saxon...
No validation errors

+ echo W3C Schema with one vocabulary X and a wildcard works for all:
W3C Schema with one vocabulary X and a wildcard works for all:
+ cat x.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:x="urn:X-x"
            targetNamespace="urn:X-x">
  <xsd:element name="thing">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:string"/>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
  <xsd:group name="notX">
    <xsd:choice>
      <xsd:any namespace="##other" processContents="lax"/>
    </xsd:choice>
  </xsd:group>
</xsd:schema>
+ cat key1.xsd
cat: key1.xsd: No such file or directory
+ w3cschema ken1.xsd ken1.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.124) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken1.xsd ken2.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.133) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken1.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.130) with no errors and no warnings.
Saxon...
No validation errors

+ echo W3C Schema with one vocabulary Y and a wildcard works for all:
W3C Schema with one vocabulary Y and a wildcard works for all:
+ cat y.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:x="urn:X-y"
            targetNamespace="urn:X-y">
  <xsd:element name="thing">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:string"/>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
  <xsd:group name="notY">
    <xsd:choice>
      <xsd:any namespace="##other" processContents="lax"/>
    </xsd:choice>
  </xsd:group>
</xsd:schema>
+ cat ken2.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:y="urn:X-y">
  <xsd:import namespace="urn:X-y" schemaLocation="y.xsd"/>
  <xsd:element name="a">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="extension">
          <xsd:complexType>
            <xsd:choice>
              <xsd:element ref="y:thing"/>
              <xsd:group ref="y:notY"/>
            </xsd:choice>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="b">
          <xsd:complexType>
            <xsd:simpleContent>
              <xsd:extension base="xsd:string"/>
            </xsd:simpleContent>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
+ w3cschema ken2.xsd ken1.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.128) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken2.xsd ken2.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.129) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken2.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.123) with no errors and no warnings.
Saxon...
No validation errors

+ echo W3C Schema with two vocabularies and no wildcard works only for two:
W3C Schema with two vocabularies and no wildcard works only for two:
+ cat ken3.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:x="urn:X-x"
            xmlns:y="urn:X-y">
  <xsd:import namespace="urn:X-x" schemaLocation="x.xsd"/>
  <xsd:import namespace="urn:X-y" schemaLocation="y.xsd"/>
  <xsd:element name="a">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="extension">
          <xsd:complexType>
            <xsd:choice>
              <xsd:element ref="y:thing"/>
              <xsd:element ref="x:thing"/>
            </xsd:choice>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="b">
          <xsd:complexType>
            <xsd:simpleContent>
              <xsd:extension base="xsd:string"/>
            </xsd:simpleContent>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
+ w3cschema ken3.xsd ken1.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.125) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken3.xsd ken2.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.124) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken3.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Error:file:///Users/admin/t/ftemp/ken3.xml:3:14:cvc-complex-type.2.4.a: Invalid content was found starting with element 'z:thing'. One of '{"urn:X-y":thing, "urn:X-x":thing}' is expected.
Parse succeeded (0.149) with 1 error and no warnings.
Saxon...
Validation error on line 3 column 14 of ken3.xml:
  In content of element <extension>: The content model does not allow element <z:thing> to appear here. Expected one of: {urn:X-y}thing, {urn:X-x}thing (See http://www.w3.org/TR/xmlschema-1/#cvc-complex-type clause 2.4)
No validation errors

+ echo W3C Schema with two wildcards 'doesn'\''t' work because of ambiguity
W3C Schema with two wildcards doesn't work because of ambiguity
+ cat ken4.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:x="urn:X-x"
            xmlns:y="urn:X-y">
  <xsd:import namespace="urn:X-x" schemaLocation="x.xsd"/>
  <xsd:import namespace="urn:X-y" schemaLocation="y.xsd"/>
  <xsd:element name="a">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="extension">
          <xsd:complexType>
            <xsd:choice>
              <xsd:element ref="y:thing"/>
              <xsd:element ref="x:thing"/>
              <xsd:group ref="y:notY"/>
              <xsd:group ref="x:notX"/>
            </xsd:choice>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="b">
          <xsd:complexType>
            <xsd:simpleContent>
              <xsd:extension base="xsd:string"/>
            </xsd:simpleContent>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
+ w3cschema ken4.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.144) with no errors and no warnings.
Saxon...
Error on line 12 of ken4.xsd:
  Error in complex type of element extension: Ambiguous content model, element <thing> appears in its own right, and also matches an <xs:any> wildcard
Schema processing failed: Ambiguous content model, element <thing> appears in its own right, and also matches an <xs:any> wildcard
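
The two-pass workaround described earlier (validate once against ken1.xsd and once against ken2.xsd, and accept the instance only if both passes are clean) could be sketched along these lines. This is a minimal Python sketch using only the standard library; it does not run a W3C Schema validator. The `make_pass`, `two_pass` and `instance` helpers are hypothetical, and each "pass" merely imitates the assumed behaviour of ken1.xsd and ken2.xsd: strict about its own namespace, lax about every other one.

```python
import xml.etree.ElementTree as ET

X_NS, Y_NS = "urn:X-x", "urn:X-y"


def make_pass(known_ns, declared):
    """Toy stand-in for ken1.xsd or ken2.xsd: strict for known_ns, lax otherwise."""
    def check(instance_xml):
        errors = []
        root = ET.fromstring(instance_xml)
        for child in root.findall("extension/*"):
            # ElementTree writes qualified names as "{namespace}local".
            if not child.tag.startswith("{"):
                errors.append("unqualified element " + child.tag)
                continue
            ns, local = child.tag[1:].split("}", 1)
            if ns == known_ns and local not in declared:
                errors.append("undeclared element " + child.tag)
            # Elements in any other namespace are accepted unchecked (lax wild card).
        return errors
    return check


pass_x = make_pass(X_NS, {"thing"})  # plays the role of ken1.xsd
pass_y = make_pass(Y_NS, {"thing"})  # plays the role of ken2.xsd


def two_pass(instance_xml):
    """Collect the errors of both passes; an empty list means the instance is accepted."""
    return pass_x(instance_xml) + pass_y(instance_xml)


def instance(prefix, local="thing"):
    """Build a ken1.xml-style instance with one extension element."""
    return ('<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">'
            "<extension><%s:%s>abc</%s:%s></extension><b>xyz</b></a>"
            % (prefix, local, prefix, local))


if __name__ == "__main__":
    for prefix in "xyz":
        print(prefix, two_pass(instance(prefix)))
    print("x:bogus", two_pass(instance("x", "bogus")))
```

The last call shows why both passes are needed in this model: an undeclared element in the "x" namespace slips through the "y" pass, because that pass sees it only as "something other than y", and is caught only by the "x" pass.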
gkhkolman-schema-20100907-1330z.zzz
-- XSLT/XQuery training: after http://XMLPrague.cz 2011-03-28/04-01 Vote for your XML training: http://www.CraneSoftwrights.com/o/i/ Crane Softwrights Ltd. http://www.CraneSoftwrights.com/o/ G. Ken Holman mailto:gkholman@CraneSoftwrights.com Male Cancer Awareness Nov'07 http://www.CraneSoftwrights.com/o/bc Legal business disclaimers: http://www.CraneSoftwrights.com/legal