
Subject: Illustration of


Fellow UBL TC members,

Please bear with me with this long example below, 
but during the call last week I detected some 
disbelief at my assertion that W3C Schema does 
not allow the simultaneous validation of two or 
more independent UBL extensions in a single 
instance with a single pass of the W3C Schema 
validator.  I would *love* to be told that I'm 
wrong, but I cannot see where and this impacts 
the "simplicity" of things we are trying to 
create for our users.  I say "simplicity" because 
when an extension module is published, it should 
be useable as is, and read-only.  If I want two 
extensions supported, I shouldn't have to create 
some über-extension-schema with knowledge of all 
(and even then I don't think W3C Schema can handle the job).

The issue has to do with wild cards.  The 
extension point in all UBL schemas allows users 
to create their own extensions and augment the 
validation of their invoices (say) and have additional information.

Different communities will have different 
extensions.  Any given community will only know of its own extensions.

On top of this, we are about to publish our 
first "standardized extension" and so users may 
wish to validate an instance that has two 
extensions, theirs and the committee's.  All 
extension schemas from committees or communities 
are read-only, but the "coordinating schema" is 
not read-only and can be changed to accommodate 
the importation of the read-only schemas.

In all of this, no read-only extension should need 
to know about any other read-only extension.

I have no problems describing a W3C Schema that 
accommodates one extension, or another schema 
that accommodates a different extension, but 
because of ambiguity I cannot seem to write a W3C 
Schema 1.0 schema that accommodates two 
extensions where either is allowed as well as something that isn't in either.

The long example is below, with many cases I can 
cover with W3C Schema until I reach the very end 
where I cannot accommodate what I actually need, given this as a starting point:

  - there is an extension in the "x" namespace, for which there is a schema
  - there is an extension in the "y" namespace, for which there is a schema
  - there is an extension in the "z" namespace, but no schema exists

The objective is to write the extension schemas 
for "x" and "y" with no knowledge of each other, 
but still accommodate the unknown "z" schema 
(from some other community).  Of course the 
importing schema must know of them, but that is 
okay, as it is the coordinating schema.

The instances:

   - ken1.xml - an instance with an "x" extension
   - ken2.xml - an instance with a "y" extension
   - ken3.xml - an instance with a "z" extension

The schemas:

   - ken.rnc, x.rnc, y.rnc - a working example of what I need in RELAX-NG
   - x.xsd, y.xsd - extension schemas for "x" and "y"
   - ken0.xsd - a schema without validation of extensions
   - ken1.xsd - a schema that validates "x" or any extension
   - ken2.xsd - a schema that validates "y" or any extension
   - ken3.xsd - a schema that validates "x" or "y" but nothing else, so
                it rightly complains with ken3.xml
   - ken4.xsd - a schema that says what I want but doesn't work with any
                instances because the schema is rejected

Because ken1.xsd uses the wild card of "anything 
except X" and ken2.xsd uses the wild card of 
"anything except Y", combining the two in a single 
schema creates an ambiguity.  "X" is accepted both 
explicitly by the declaration from ken1.xsd and 
implicitly by the wild card from ken2.xsd.  Such an ambiguity isn't allowed.
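
To make the overlap concrete, here is a minimal sketch in Python that 
models the four alternatives of the choice in ken4.xsd and lists which 
of them accept a given element.  It is only a simplified model of the 
"one element, one particle" rule, not a schema processor: the helper 
names (element, other_wildcard, matching_particles, is_ambiguous) are my 
own inventions for illustration, and "##other" is modelled simply as 
"any namespace-qualified element outside the target namespace".

```python
from typing import Callable, List, Tuple

X_NS = "urn:X-x"
Y_NS = "urn:X-y"
Z_NS = "urn:X-z"

# A particle is a label plus a predicate over (namespace, local name).
Particle = Tuple[str, Callable[[str, str], bool]]


def element(ns: str, local: str) -> Particle:
    """Model an explicit element declaration such as <xsd:element ref="x:thing"/>."""
    return (f"element {{{ns}}}{local}", lambda n, l: n == ns and l == local)


def other_wildcard(target_ns: str) -> Particle:
    """Model <xsd:any namespace="##other"/> declared in a schema for target_ns.

    Assumption: ##other accepts any namespace-qualified element whose
    namespace differs from the target namespace of the declaring schema.
    """
    return (
        f"any ##other (not {target_ns})",
        lambda n, l: n != "" and n != target_ns,
    )


# The alternatives in ken2.xsd and ken4.xsd, in document order.
KEN2_CHOICE: List[Particle] = [element(Y_NS, "thing"), other_wildcard(Y_NS)]
KEN4_CHOICE: List[Particle] = [
    element(Y_NS, "thing"),
    element(X_NS, "thing"),
    other_wildcard(Y_NS),  # y:notY
    other_wildcard(X_NS),  # x:notX
]


def matching_particles(particles: List[Particle], ns: str, local: str) -> List[str]:
    """Return the labels of every particle that accepts the element."""
    return [label for label, matches in particles if matches(ns, local)]


def is_ambiguous(particles: List[Particle], ns: str, local: str) -> bool:
    """An element accepted by more than one particle makes the choice ambiguous."""
    return len(matching_particles(particles, ns, local)) > 1


if __name__ == "__main__":
    for ns in (X_NS, Y_NS, Z_NS):
        print(ns, matching_particles(KEN4_CHOICE, ns, "thing"))
```

In this model every one of the three elements is claimed by two 
alternatives of the ken4.xsd choice, while the ken2.xsd choice on its own 
claims each element exactly once.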

I've asked colleagues I respect if I have missed 
a feature of W3C Schema 1.0 that provides for two 
wild cards, and so far the only response I've 
received is "use NVDL", which of course will 
work, but that isn't the pure W3C Schema solution 
we've committed to delivering to our users.

Thinking "there isn't one", I'm instructing UBL 
users that when they want to validate their XML 
instances with two possible known extensions and 
any number of unknown extensions, they have to 
prepare two sets of schemas, one validating one 
known extension and any other, and the other 
validating the other known extension and any 
other.  Where RELAX-NG validates the instance 
with a single pass, the use of W3C Schema 
requires two schema passes with two different 
schemas.  In the above list, "ken1.xsd" and "ken2.xsd".
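
For anyone wanting to script the two passes, the workflow can be 
sketched in Python as below.  This is only an outline under 
assumptions: the "validate" argument is a placeholder for whatever 
W3C Schema validator you actually call (it is assumed to return a list 
of error messages, empty when the instance is valid), and the function 
names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

# Placeholder signature: (schema path, instance path) -> error messages.
Validator = Callable[[str, str], Iterable[str]]


def validate_in_passes(
    instance: str, schemas: Iterable[str], validate: Validator
) -> Dict[str, List[str]]:
    """Run one validation pass per schema and collect the errors per schema."""
    results: Dict[str, List[str]] = {}
    for schema in schemas:
        results[schema] = list(validate(schema, instance))
    return results


def is_valid(results: Dict[str, List[str]]) -> bool:
    """The instance is accepted only if every pass reported no errors."""
    return all(not errors for errors in results.values())
```

A call such as validate_in_passes("ken1.xml", ["ken1.xsd", "ken2.xsd"], 
my_validator) would then run both passes, with my_validator standing in 
for a wrapper around your validator of choice.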

Please let me know if you have any questions.  If 
I can make the call tomorrow, it will only be for 
about 15 minutes, so if you can post them 
publicly or to me privately, I'll try to be 
prepared to address them quickly in the call or on this list.

Thanks!

. . . . . . . . . . . Ken

p.s. files attached are in ZIP format

p.p.s. if I do get feedback from my colleagues, 
I'll summarize to the list what they tell me

+ echo The instances to be validated:
The instances to be validated:
+ cat ken1.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
   <extension>
     <x:thing>abc</x:thing>
   </extension>
   <b>xyz</b>
</a>
+ cat ken2.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
   <extension>
     <y:thing>abc</y:thing>
   </extension>
   <b>xyz</b>
</a>
+ cat ken3.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
   <extension>
     <z:thing>abc</z:thing>
   </extension>
   <b>xyz</b>
</a>
+ echo RELAX-NG works:
RELAX-NG works:
+ cat ken.rnc
namespace x = "urn:X-x"
namespace y = "urn:X-y"

include "x.rnc"
include "y.rnc"

start = element a
    {
       element extension
       {
          x-thing
        |
          y-thing
        |
          element * - ( x-notX | y-notY ) { text }
       }?,
       element b { text }
    }
+ cat ken1.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
   <extension>
     <x:thing>abc</x:thing>
   </extension>
   <b>xyz</b>
</a>
+ rnc ken.rnc ken1.xml
+ cat ken2.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
   <extension>
     <y:thing>abc</y:thing>
   </extension>
   <b>xyz</b>
</a>
+ rnc ken.rnc ken2.xml
+ cat ken3.xml
<a xmlns:x="urn:X-x" xmlns:y="urn:X-y" xmlns:z="urn:X-z">
   <extension>
     <z:thing>abc</z:thing>
   </extension>
   <b>xyz</b>
</a>
+ rnc ken.rnc ken3.xml
+ echo W3C Schema with a wild card works, but 'doesn'\''t' validate:
W3C Schema with a wild card works, but doesn't validate:
+ cat ken0.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

<xsd:element name="a">
   <xsd:complexType>
     <xsd:sequence>
       <xsd:element name="extension">
         <xsd:complexType>
           <xsd:sequence>
             <xsd:any processContents="skip"/>
           </xsd:sequence>
         </xsd:complexType>
       </xsd:element>
       <xsd:element name="b">
         <xsd:complexType>
           <xsd:simpleContent>
             <xsd:extension base="xsd:string"/>
           </xsd:simpleContent>
         </xsd:complexType>
       </xsd:element>
     </xsd:sequence>
   </xsd:complexType>
</xsd:element>

</xsd:schema>
+ w3cschema ken0.xsd ken1.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.121) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken0.xsd ken2.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.120) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken0.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.132) with no errors and no warnings.
Saxon...
No validation errors
+ echo W3C Schema with one vocabulary X and a wildcard works for all:
W3C Schema with one vocabulary X and a wildcard works for all:
+ cat x.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:x="urn:X-x"
             targetNamespace="urn:X-x">

<xsd:element name="thing">
   <xsd:complexType>
     <xsd:simpleContent>
       <xsd:extension base="xsd:string"/>
     </xsd:simpleContent>
   </xsd:complexType>
</xsd:element>

<xsd:group name="notX">
   <xsd:choice>
     <xsd:any namespace="##other" processContents="lax"/>
   </xsd:choice>
</xsd:group>

</xsd:schema>
+ cat key1.xsd
cat: key1.xsd: No such file or directory
+ w3cschema ken1.xsd ken1.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.124) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken1.xsd ken2.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.133) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken1.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.130) with no errors and no warnings.
Saxon...
No validation errors
+ echo W3C Schema with one vocabulary Y and a wildcard works for all:
W3C Schema with one vocabulary Y and a wildcard works for all:
+ cat y.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:x="urn:X-y"
             targetNamespace="urn:X-y">

<xsd:element name="thing">
   <xsd:complexType>
     <xsd:simpleContent>
       <xsd:extension base="xsd:string"/>
     </xsd:simpleContent>
   </xsd:complexType>
</xsd:element>

<xsd:group name="notY">
   <xsd:choice>
     <xsd:any namespace="##other" processContents="lax"/>
   </xsd:choice>
</xsd:group>

</xsd:schema>
+ cat ken2.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:y="urn:X-y">

<xsd:import namespace="urn:X-y" schemaLocation="y.xsd"/>

<xsd:element name="a">
   <xsd:complexType>
     <xsd:sequence>
       <xsd:element name="extension">
         <xsd:complexType>
           <xsd:choice>
             <xsd:element ref="y:thing"/>
             <xsd:group ref="y:notY"/>
           </xsd:choice>
         </xsd:complexType>
       </xsd:element>
       <xsd:element name="b">
         <xsd:complexType>
           <xsd:simpleContent>
             <xsd:extension base="xsd:string"/>
           </xsd:simpleContent>
         </xsd:complexType>
       </xsd:element>
     </xsd:sequence>
   </xsd:complexType>
</xsd:element>

</xsd:schema>
+ w3cschema ken2.xsd ken1.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.128) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken2.xsd ken2.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.129) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken2.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.123) with no errors and no warnings.
Saxon...
No validation errors
+ echo W3C Schema with two vocabularies and no wildcard works only for two:
W3C Schema with two vocabularies and no wildcard works only for two:
+ cat ken3.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:x="urn:X-x" xmlns:y="urn:X-y">

<xsd:import namespace="urn:X-x" schemaLocation="x.xsd"/>
<xsd:import namespace="urn:X-y" schemaLocation="y.xsd"/>

<xsd:element name="a">
   <xsd:complexType>
     <xsd:sequence>
       <xsd:element name="extension">
         <xsd:complexType>
           <xsd:choice>
             <xsd:element ref="y:thing"/>
             <xsd:element ref="x:thing"/>
           </xsd:choice>
         </xsd:complexType>
       </xsd:element>
       <xsd:element name="b">
         <xsd:complexType>
           <xsd:simpleContent>
             <xsd:extension base="xsd:string"/>
           </xsd:simpleContent>
         </xsd:complexType>
       </xsd:element>
     </xsd:sequence>
   </xsd:complexType>
</xsd:element>

</xsd:schema>
+ w3cschema ken3.xsd ken1.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.125) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken3.xsd ken2.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.124) with no errors and no warnings.
Saxon...
No validation errors
+ w3cschema ken3.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Error:file:///Users/admin/t/ftemp/ken3.xml:3:14:cvc-complex-type.2.4.a: 
Invalid content was found starting with element 
'z:thing'. One of '{"urn:X-y":thing, "urn:X-x":thing}' is expected.
Parse succeeded (0.149) with 1 error and no warnings.
Saxon...
Validation error on line 3 column 14 of ken3.xml:
   In content of element <extension>: The content 
model does not allow element <z:thing> to
   appear here. Expected one of: {urn:X-y}thing, {urn:X-x}thing (See
   http://www.w3.org/TR/xmlschema-1/#cvc-complex-type clause 2.4)
No validation errors
+ echo W3C Schema with two wildcards 'doesn'\''t' work because of ambiguity
W3C Schema with two wildcards doesn't work because of ambiguity
+ cat ken4.xsd
<?xml version="1.0" encoding="US-ASCII"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:x="urn:X-x" xmlns:y="urn:X-y">

<xsd:import namespace="urn:X-x" schemaLocation="x.xsd"/>
<xsd:import namespace="urn:X-y" schemaLocation="y.xsd"/>

<xsd:element name="a">
   <xsd:complexType>
     <xsd:sequence>
       <xsd:element name="extension">
         <xsd:complexType>
           <xsd:choice>
             <xsd:element ref="y:thing"/>
             <xsd:element ref="x:thing"/>
             <xsd:group ref="y:notY"/>
             <xsd:group ref="x:notX"/>
           </xsd:choice>
         </xsd:complexType>
       </xsd:element>
       <xsd:element name="b">
         <xsd:complexType>
           <xsd:simpleContent>
             <xsd:extension base="xsd:string"/>
           </xsd:simpleContent>
         </xsd:complexType>
       </xsd:element>
     </xsd:sequence>
   </xsd:complexType>
</xsd:element>

</xsd:schema>
+ w3cschema ken4.xsd ken3.xml
Xerces...
Attempting validating, namespace-aware parse
Parse succeeded (0.144) with no errors and no warnings.
Saxon...
Error on line 12 of ken4.xsd:
   Error in complex type of element extension: 
Ambiguous content model, element <thing>
   appears in its own right, and also matches an <xs:any> wildcard
Schema processing failed: Ambiguous content 
model, element <thing> appears in its own right, 
and also matches an <xs:any> wildcard

gkhkolman-schema-20100907-1330z.zzz


--
XSLT/XQuery training:   after http://XMLPrague.cz 2011-03-28/04-01
Vote for your XML training:   http://www.CraneSoftwrights.com/o/i/
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/o/
G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Male Cancer Awareness Nov'07  http://www.CraneSoftwrights.com/o/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal

