Subject: Candidate approach for W3C Schema with wild cards for code list value validation
This is to Tony Coates's attention, but I am anxious to hear from any others if there are flaws in my approach that I document in this post.

Based on today's call, I was unclear on the restrictions of using lax validation in W3C Schema, so I created a test here that I believe illustrates the use of W3C Schema as a "second pass" value validation, checking *only* the value of the currency attribute *and no structure bits whatsoever*. This approach would be used as an alternative to using ISO 19757-3 Schematron that others may find more palatable.

To recap, from Tony's summary posted as a password-protected file announced in:

  http://lists.oasis-open.org/archives/ubl/200508/msg00043.html

... we are contemplating a two-pass validation, the first pass being a structural validation with "class 2" code list information items unvalidated, and a second pass where *only* "class 2" code list information items have their values validated.

Many use XPath and ISO 19757-3 Schematron for this second pass, reaching into the instance and checking only the values. This *necessarily* must be a second pass because the first pass ensures the structural integrity of the node tree into which the XPath expressions reach ... without structural integrity there is no integrity in the XPath evaluations and one could get false positives (among other unexpected results) for tests (in the general case).

The question today was: for those who cannot use ISO 19757-3 Schematron, how would one use W3C Schema technology for the second pass, where only the code list information item values are being checked against an enumeration?
I've uploaded an unprotected .ZIP with an example, using currency (even though I know that currency is "class 1" and isn't one of the "class 2" ones we are going to extend) because of my limited time, but it shows the principles I had in mind:

  http://www.oasis-open.org/committees/download.php/13998/codelist-xsd-gkholman-20050811-1940z.zip

I have two test UBL 1.0 instances: test.xml is a copy of the office invoice instance, and testbad.xml is the same data with an invalid currency code.

I changed "codelist\UBL-CodeList-CurrencyCode-1.0.xsd" to be unrestricted normalized string (we talked in the room about using NMTOKEN; I'm not sure why 1.0 used normalized string and not NMTOKEN, so I left it as normalized string), with all of the required attributes. Thus, with "maindoc\UBL-Invoice-1.0.xsd", the structure of cbc:TotalTaxAmount will be validated, but the value of its currency attribute can be any normalized string, thus accomplishing the structural validation without value validation.

I then created "maindoc\CL-Invoice-1.0.xsd", which allows anything anywhere, but with lax validation, so that if any other declarations are present, those are validated. That imports:

 (1) - "codelist/CL-CodeList-CurrencyCode-1.0.xsd", which defines the type as allowing any attributes but with an enumeration of the allowed values

 (2) - "common\CL-CommonBasicComponents-1.0.xsd", which only has a declaration for cbc:TotalTaxAmount, so it is the only item that will be validated

No other changes were made.

From what I can tell, "maindoc\CL-Invoice-1.0.xsd" is a schema that allows any structure anywhere in the instance, but constrains the currency value of cbc:TotalTaxAmount (found anywhere) to be from an enumerated list.

The "test.bat" file assumes there is a "w3cschema.bat" file on the path that validates instances using the model expressed in the first argument. In my environment, I run Sun MSV for my W3C Schema validation (I haven't tested this with any other W3C Schema processor).
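For readers who do not want to open the .ZIP, the following is a minimal sketch of what such a pair of value-only schemas could look like. It is not a copy of the uploaded files: the namespace URIs ("urn:example:invoice", "urn:example:cbc"), the file name "cl-components-sketch.xsd", the type name "CurrencyCodeSketchType" and the three-value enumeration are all placeholders assumed for illustration, and the enumeration is folded into the components schema rather than kept in a separate code list file.

The document-level schema declares only the document element and lets a lax wildcard accept everything beneath it:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: namespaces and file names are placeholders, not the UBL ones -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="urn:example:invoice"
            elementFormDefault="qualified">

  <!-- Bring in the only declarations the second pass cares about -->
  <xsd:import namespace="urn:example:cbc"
              schemaLocation="cl-components-sketch.xsd"/>

  <!-- Any children and any attributes are accepted. With
       processContents="lax" an item is validated only when a global
       declaration for it is available; otherwise it is let through. -->
  <xsd:element name="Invoice">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:any namespace="##any" processContents="lax"
                 minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
      <xsd:anyAttribute namespace="##any" processContents="lax"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

The imported schema (the hypothetical "cl-components-sketch.xsd") declares the one element whose currency attribute is to be checked:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: placeholder namespace, type name and code values -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:cbc="urn:example:cbc"
            targetNamespace="urn:example:cbc"
            elementFormDefault="qualified">

  <!-- Allowed codes; a real list would be synthesized from the
       code list data rather than typed by hand -->
  <xsd:simpleType name="CurrencyCodeSketchType">
    <xsd:restriction base="xsd:normalizedString">
      <xsd:enumeration value="CAD"/>
      <xsd:enumeration value="EUR"/>
      <xsd:enumeration value="USD"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- The element content is left as an unconstrained string because
       the first pass has already checked it; only the currency
       attribute is constrained, and other attributes are let through -->
  <xsd:element name="TotalTaxAmount">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:string">
          <xsd:attribute name="amountCurrencyID"
                         type="cbc:CurrencyCodeSketchType"/>
          <xsd:anyAttribute namespace="##any" processContents="lax"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Under these assumptions, a TotalTaxAmount element at any depth below Invoice should be matched to the global declaration through the lax wildcards, so an amountCurrencyID value outside the enumeration is reported while every undeclared element and attribute around it is accepted.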
This appears to indicate that my approach to non-structural value-only validation with W3C Schema expressions works:

===8<---
T:\test2>test

T:\test2>call w3cschema xsd\maindoc\UBL-Invoice-1.0.xsd test.xml
No validation errors.

T:\test2>call w3cschema xsd\maindoc\UBL-Invoice-1.0.xsd testbad.xml
No validation errors.

T:\test2>call w3cschema xsd\maindoc\CL-Invoice-1.0.xsd test.xml
No validation errors.

T:\test2>call w3cschema xsd\maindoc\CL-Invoice-1.0.xsd testbad.xml
start parsing a grammar.
validating testbad.xml
Error at line:58, column:84 of file:///T:/test2/testbad.xml
  attribute "amountCurrencyID" has a bad value: the value is not a member of the enumeration.

the document is NOT valid.

T:\test2>
===8<---

It would appear that I do not need a wrapper element from another namespace, as was discussed during the call.

Have I messed up somewhere?

I suspect this is a design pattern for any kind of value-only validation we need to implement. I suppose a complete solution would have "common\CL-CommonBasicComponents-1.0.xsd" declare only those elements that have a type from the class-2 code lists, the "codelist\CL-*.xsd" files would have all the enumerations synthesized from the code list value instances, and the "maindoc\CL-*.xsd" files would import the codelist files with the desired enumerations.

My reference to "my limited time" above is because I really have to focus on the HISC output specifications and I don't have much time to help out more on the code list stuff, but I did commit today to documenting my ideas for a W3C Schema-based second-pass methodology, as I hope I have done in this note.

I can work later on the XSLT to synthesize the enumerations if you wish, Tony, or leave it with you and commit to help you with any questions you may have.

Please let me know if anyone has any questions.

. . . . . . . Ken

--
World-wide on-site corporate, govt. & user group XML/XSL training.
G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/o/
Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
Male Breast Cancer Awareness  http://www.CraneSoftwrights.com/o/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal