Subject: Code list discussion so far
Hello UBL TC,

We held the first of two discussions to resolve the code list issue this morning; the second of the two will take place Thursday afternoon (1 p.m. in Ottawa). Preliminary outcomes are as follows.

- There are serious use cases that require modifications to code lists in the interval between official revisions of code lists. This is especially true in the case of industry-specific code lists.

- Solutions that require the namespaces in the UBL schemas to be changed when a code list is modified are very expensive.

- There appear to be three ways to accomplish modifications to UBL schemas without changing the namespaces:

  1. Users simply modify the file containing the code list while leaving everything else alone. This method is being used successfully in Denmark. Obviously we cannot prevent users from doing this, and given a proper notification procedure, it seems to work pretty well.

  2. We explicitly enable modifications to the code lists by embedding a "substitution group hook" in the UBL schemas as described by Tony Coates and Marty Burns. While cleaner from a conceptual point of view, we're finding it difficult to see any big advantage of this approach over simply swapping out one code list module and replacing it with another one. The basic notification and management issues appear to be about the same.

  3. We take a radically different view of the problem by distinguishing between two kinds of code lists:

     a. Code lists that define codes used only in UBL (status codes, for example). Such lists are typically well-defined, are completely under our control, and are not (or should not be) extensible.

     b. Code lists that are defined by outside agencies and referenced in UBL. These are conceptually distinct from the first category even if some happen to be bundled into the UBL package.

     Making this distinction would allow us to take two different approaches to code list definition.
     Code lists of the first kind could be defined in schema modules using enumerations just as we do in 1.0. Code lists of the second kind could be defined in XML instances of a standard code list schema, with the codes of this kind declared as unrestricted strings. Ordinary XSD validation would be used for the first kind of code list, just as in 1.0, whereas validation of the second kind would typically take place in a second validation phase using something like Schematron.

Participants in the discussion noted the following points regarding this third alternative:

- Publishing standard code lists as instances of a standard code list schema is much closer to the basic XML concept than publishing code lists as schema fragments. In fact, the whole namespace problem we've been wrestling with here can be seen as an artifact of the attempt to use things that should change very rarely (schemas) to publish things that people often want to modify (code lists). One result of this has been that instead of recommending a standard for code lists using a standard formalism (such as an XSD schema) we have been recommending a template for code list schemas for which there is no standard formalism, just a complex set of prose descriptions supplemented by examples. The code list paper published in UBL 1.0 admits it to be "desirable that the [code list] data model be expressed in a machine readable form" but can do no more than to place this desirable development in some distant future where a formalism exists for doing so. The definition of a standard XML schema for code lists would solve this simply by putting such a definition at the appropriate conceptual level.

- Defining codes as unrestricted strings would obviously make it trivially easy to meet all the requirements for ad hoc code list modification. The tradeoff would be that the code lists themselves could no longer be used to directly drive XSD validation.
  It is unlikely, however, that any major user of the UBL schemas would be satisfied with just a simple check against an enumeration before entering the document into an accounting application; it is much more likely that something like a Schematron check would be performed following simple XSD validation. This is in fact what is done in the Danish implementations, and it more closely reflects an initial premise of the UBL code list effort that most code list validation would take place at the application level (report of the NDRSC, 18 March 2002).

- Post-schema validation appears to be less problematic than what we're hearing from initiatives that are attempting to use substitution groups. We believe it to be significant that this is the approach adopted for ISO 20022 (banking).

- We could provide a mechanism (an XSLT transformation, for example) that would take *any* code list published using the standard code list schemas and generate code list schema modules just like the ones we've included in UBL 1.0. (The XSLT would, in effect, provide the missing formalism needed to specify the construction of the schema modules in a machine-readable way.) In fact, we could provide the modules so generated as part of the release package together with instructions for validating instances against these generated modules in a second XSD pass, thus providing all of the advantages of validation against enumerations while still allowing easy modification of code lists.

- In a separate decision, the TC decided this morning to accept the UDT and CCT schema modules defined by UN/CEFACT ATG2 rather than defining and maintaining our own. Those schema modules reference a few standard code lists (currencies, language codes, units of measure, MIME media types) that would retain the old enumeration form.
  As most real-world situations requiring code list modification are encountered not with these very basic standard lists but with industry-defined code lists, this is not considered a problem.

- Mark Crawford wished to be put on record as having reservations about this approach for two reasons:

  1. The desirability of maintaining XSD validation of codes, and

  2. The wish to maintain alignment with ATG2, which intends to specify all code lists as schema modules.

  It is recognized, however, that ATG2 does not have customization as a goal, whereas we do (though to what extent still remains to be determined).

- To make this approach practical for users, it will be necessary to provide documentation showing users how to implement a post-schema code list validation phase using Schematron. Bryan Rasmussen has volunteered to create this if his management will approve the work.

Everyone interested in this subject should be prepared to participate in tomorrow's follow-up discussion (1 p.m. Ottawa time Thursday 11 August at the usual UBL conference number).

Jon
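
For readers trying to picture the third alternative, here is a minimal sketch of the two mechanisms discussed above: a second-phase check of codes against a code list published as an XML instance, and the generation of a 1.0-style enumeration from that same instance. It is only an illustration under stated assumptions. No standard code list schema has been defined yet, so the CodeList/Code vocabulary, the sample codes, and the function names are all hypothetical. Plain Python stands in for Schematron and XSLT here, and namespaces are omitted from the sample document to keep it short.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical code list instance. The element and attribute names are
# invented for this sketch; they are not a published code list schema.
CODE_LIST_XML = """\
<CodeList id="PaymentMeansCode" agency="example-agency" version="0.1">
  <Code value="10">Cash</Code>
  <Code value="20">Cheque</Code>
  <Code value="42">Bank transfer</Code>
</CodeList>
"""

# Hypothetical document fragment in which the code is an unrestricted
# string, so first-pass XSD validation would accept both values.
DOCUMENT_XML = """\
<Invoice>
  <PaymentMeans><PaymentMeansCode>42</PaymentMeansCode></PaymentMeans>
  <PaymentMeans><PaymentMeansCode>99</PaymentMeansCode></PaymentMeans>
</Invoice>
"""


def load_codes(code_list_xml):
    """Return the set of code values declared in a code list instance."""
    root = ET.fromstring(code_list_xml)
    return {code.get("value") for code in root.iter("Code")}


def find_invalid_codes(document_xml, element_name, allowed):
    """Second validation phase: list code values not in the code list."""
    root = ET.fromstring(document_xml)
    values = [(el.text or "").strip() for el in root.iter(element_name)]
    return [value for value in values if value not in allowed]


def generate_enumeration_type(code_list_xml):
    """Build an xsd:simpleType enumeration from a code list instance.

    The type name suffix and the normalizedString base are assumptions
    made for this sketch.
    """
    ET.register_namespace("xsd", XSD_NS)
    code_list = ET.fromstring(code_list_xml)
    simple_type = ET.Element(
        f"{{{XSD_NS}}}simpleType",
        {"name": code_list.get("id") + "ContentType"},
    )
    restriction = ET.SubElement(
        simple_type,
        f"{{{XSD_NS}}}restriction",
        {"base": "xsd:normalizedString"},
    )
    for code in code_list.iter("Code"):
        ET.SubElement(
            restriction,
            f"{{{XSD_NS}}}enumeration",
            {"value": code.get("value")},
        )
    return ET.tostring(simple_type, encoding="unicode")


if __name__ == "__main__":
    allowed = load_codes(CODE_LIST_XML)
    # Only "99" is absent from the sample code list.
    print(find_invalid_codes(DOCUMENT_XML, "PaymentMeansCode", allowed))
    print(generate_enumeration_type(CODE_LIST_XML))
```

The point of the sketch is that adding a code means editing only the code list instance: the second-phase check picks up the change immediately, and the enumeration can be regenerated for anyone who wants a second XSD pass, without touching the document schemas or their namespaces.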