OASIS Mailing List Archives

ubl message


Subject: Re: Wednesday's code list "meta-data block" implementation proposal


(Copied the whole UBL list so that this hopefully goes out to all involved.)

This is a response to my own message, with a few more thoughts on the 
code-related supplementary components.  I believe these apply no matter 
which mechanism is chosen, though the specific consequences might be 
different for each.

When I began to talk to code list owners and users a year or so ago, the 
first thing I learned was that there was generally a pretty poor matchup 
between the way the owners characterize their own code lists and the 
official supplementary components.  It seemed that many code lists 
aren't very scientific about version numbers, and some don't use them at 
all; rather, they use validity periods/expiration dates.  If we were 
relying on code list owners to fill in all these bits of information 
themselves, they would need a lot of help, and might even have to think 
differently about their management of their own code lists.

One thing we always intended to allow (as far as I know) was the ability 
to add new fields that could function as "application-specific" 
supplementary components.  An expiration date is an example.  If we 
still agree with this, then extensibility would be an important 
characteristic of the chosen mechanism.  E.g., you'd probably want to 
allow <xs:anyAttribute> on the meta-data block element.
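
A minimal sketch of what that extensibility might look like in W3C 
schema, assuming the generic declaration element from method 4.1 in the 
quoted message below.  The target namespace URI and the attribute types 
are hypothetical placeholders, not anything UBL has assigned:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:codelist-metadata"
           elementFormDefault="qualified">
  <!-- Hypothetical declaration of the generic meta-data element -->
  <xs:element name="meta-data">
    <xs:complexType>
      <!-- Namespace URI of the code list being described -->
      <xs:attribute name="ns" type="xs:anyURI" use="required"/>
      <!-- Official supplementary components stay optional -->
      <xs:attribute name="agency" type="xs:token"/>
      <xs:attribute name="version" type="xs:token"/>
      <!-- Extension point: any attribute from a foreign namespace,
           e.g. an application-specific expiration date -->
      <xs:anyAttribute namespace="##other" processContents="lax"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With something like this, a code list owner who tracks validity periods 
rather than versions could add, say, an ext:expirationDate attribute 
(where ext: is bound to the owner's own namespace) without any change to 
the meta-data block schema.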

(I could dig up the email exchanges I had with the UN/ECE folks if you'd 
like to see more; I believe I forwarded them all to the NDRSC list.)

	Eve

Eve L. Maler wrote:

> Hi Ken and everyone,
> 
> G. Ken Holman wrote:
> 
>> Two requirements not present in the 2004-02-12 code list 
>> representation working draft are:
>>
>> (A) - that the supplementary components of the code lists of code list 
>> values utilized in a UBL instance be available in the XML instance 
>> proper without any processing from any external source including any 
>> schema expression
> 
> 
> While I made that argument mildly yesterday, I can see how a namespace 
> URI made up of special parts can become known by processors as the 
> (opaque-string) "name" or "identifier" of that particular mix of 
> options.  So this isn't as important to me as the oddity of requiring 
> namespace URIs to have a certain format, no matter who's assigning them.
> 
>> (B) - that the supplementary components be available for all 
>> code-list-value information items even when two or more such 
>> information items are found in the set of #PCDATA and attribute 
>> information items for any given element
>>
>> One interpretation of the approach that was floated today (termed 
>> colloquially as the "meta-data block" approach) is as follows:
>>
>> (1) - every UBL document shall have as many code list meta-data 
>> declarations as code lists from which values are used in the document
>>
>> (2) - all code list meta-data declarations shall be grouped in a 
>> container element at the start of the document
>>
>> (3) - a validated UBL document has confirmed that for every code 
>> list-based information item in the instance there exists a code list 
>> meta-data declaration
> 
> 
> At the minimum, of course, you get this for free because of the reliance 
> on XML namespaces for code.  <foo:elem>code</foo:elem> and <elem 
> bar:att="code"/> both require foo: and bar: to be "declared".  The next 
> issue below is where it gets more complicated.
> 
>> - validation of an XML instance does require two processes, one 
>> expressed in W3C schema constraints and an application-level 
>> constraint check that cannot be expressed using W3C schema.  At the 
>> meeting I conjectured that it could all be done in W3C schema but only 
>> when writing it up did I realize that the optional nature of code list 
>> values in the instance requires the cardinality of the meta-data block 
>> to be optional and there are no W3C schema co-occurrence constraints 
>> to check the presence of two optional components.  It was noted at the 
>> meeting that one should be able to express this constraint in 
>> Schematron assertions.
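
A rough sketch of how that application-level check might be written as a 
Schematron assertion, assuming declarations shaped like method 4.1 later 
in this message (a cl:meta-data element whose ns attribute names the 
code list namespace URI).  The cl namespace URI is a hypothetical 
placeholder, the ISO Schematron namespace is used for illustration, and 
the rule assumes coded values are the only namespace-qualified 
attributes in the instance; xml: and xsi: attributes would need to be 
excluded in practice, and coded element content is not covered:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <!-- Hypothetical URI standing in for the UBL code list namespace -->
  <sch:ns prefix="cl" uri="urn:example:codelist-metadata"/>
  <sch:pattern>
    <!-- Visit each element carrying a namespace-qualified attribute -->
    <sch:rule context="*[@*[namespace-uri() != '']]">
      <!-- Fail if any such attribute's namespace URI has no matching
           declaration in the block at the top of the document -->
      <sch:assert test="not(@*[namespace-uri() != '' and
          not(namespace-uri() = /*/cl:meta-data-block/cl:meta-data/@ns)])">
        A coded attribute uses a code list namespace that has no
        meta-data declaration.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```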
> 
> 
> I'm not sure I understand the concern here.  It was always the case, 
> because of XSD's lack of co-occurrence constraints, that all the 
> supplementary components for codes had to be "optional" and we had to 
> document "patterns" of usage depending on the type and source of the 
> code list.  The February 8 code list document shows this in Section 3.3, 
> "Examples of Use".  It's never the case that *all* of the supplementary 
> components need to be filled in, because some pairs of them provide 
> alternative means of supplying the same basic info ("what agency is this 
> from?" etc.).
> 
> So in both the new proposal and all previous proposals considered, it 
> was never possible to use XSD to enforce provision of the "right" set of 
> supplementary components.  I just realized that with a variation on the 
> meta-data block proposal, by the way, this *could* be enforced!  You 
> could have a unique meta-data block element for each pattern, like so:
> 
> <!-- all attribs required -->
> <codelist-managed-by-de3055-agency
>   xmlns:...="..."
>   listID="..."
>   listVersionID="..."
>   listAgencyID="..."
> />
> 
> <!-- all attribs required -->
> <codelist-managed-by-other-std-agency
>   xmlns:...="..."
>   listID="..."
>   listVersionID="..."
>   listAgencyID="..."
>   listAgencySchemeID="..."
>   listAgencySchemeAgencyID="..."
> />
> 
> <!-- etc. -->
> 
> But this may very well be overkill, which is perhaps why we haven't 
> considered it before.  It would have been possible to add it to the 
> datatyping proposal we've got now.
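
A minimal sketch of how the per-pattern variation might be enforced in 
W3C schema, using the element and attribute names from the two examples 
above.  The target namespace URI, the attribute types, and the name of 
the enclosing block element are assumptions made for illustration:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:cl="urn:example:codelist-metadata"
           targetNamespace="urn:example:codelist-metadata"
           elementFormDefault="qualified">

  <!-- First pattern: three components, all required -->
  <xs:element name="codelist-managed-by-de3055-agency">
    <xs:complexType>
      <xs:attribute name="listID" type="xs:token" use="required"/>
      <xs:attribute name="listVersionID" type="xs:token" use="required"/>
      <xs:attribute name="listAgencyID" type="xs:token" use="required"/>
    </xs:complexType>
  </xs:element>

  <!-- Second pattern: five components, all required -->
  <xs:element name="codelist-managed-by-other-std-agency">
    <xs:complexType>
      <xs:attribute name="listID" type="xs:token" use="required"/>
      <xs:attribute name="listVersionID" type="xs:token" use="required"/>
      <xs:attribute name="listAgencyID" type="xs:token" use="required"/>
      <xs:attribute name="listAgencySchemeID" type="xs:token"
                    use="required"/>
      <xs:attribute name="listAgencySchemeAgencyID" type="xs:token"
                    use="required"/>
    </xs:complexType>
  </xs:element>

  <!-- The block accepts any mix of the pattern-specific elements -->
  <xs:element name="meta-data-block">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="cl:codelist-managed-by-de3055-agency"/>
        <xs:element ref="cl:codelist-managed-by-other-std-agency"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because each pattern gets its own element, use="required" does the work 
that a co-occurrence constraint would otherwise have to do; the cost is 
one new element declaration per pattern.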
> 
>> (4) - two ways of declaring the association of supplementary 
>> components, both of which utilize a generic UBL code-list meta-data 
>> declaration container as the first child element of the document 
>> element, dereference the association differently:
>>
>> (4.1) - the code list meta-data declaration consists of a generic UBL 
>> element type and a required URI-typed attribute identifying the 
>> namespace URI of the code list from which a coded value is used in an 
>> information item with the same namespace URI, and optional attributes 
>> for the supplementary components
>>
>>      <in:invoice xmlns:curr="---arbitrary-currency-URI-here"
>>                  xmlns:in="---UBL invoice--"
>>                  xmlns:cl="---UBL code list---"
>>                  ...>
>>        <cl:meta-data-block>
>>           <cl:meta-data ns="---arbitrary-currency-URI-here"
>>                         agency="xxx" version="yyy" .../>
>>        </cl:meta-data-block>
>>        ....
>>        <amount curr:currency="CAD">123</amount>
> 
> 
> (I would imagine that cl:meta-data-block could be in the common module 
> and not need a namespace that's unique to code lists...)
> 
>> (4.2) - the code list meta-data declaration consists of an element 
>> type whose namespace URI matches the namespace URI of the code list 
>> from which a coded value is used in an information item with the same 
>> namespace URI, and optional attributes for the supplementary components
>>
>>      <in:invoice xmlns:curr="---arbitrary-currency-URI-here"
>>                  xmlns:in="---UBL invoice--"
>>                  xmlns:cl="---UBL code list---"
>>                  ...>
>>        <cl:meta-data-block>
>>           <curr:meta-data agency="xxx" version="yyy" .../>
>>        </cl:meta-data-block>
>>        ....
>>        <amount curr:currency="CAD">123</amount>
>>
>> Both 4.1 and 4.2 are naming issues and are orthogonal to the data 
>> types and other code list schema fragment issues.  Refer to the code 
>> list requirements documents and other CLSC members for data type 
>> issues (that I am not familiar with; I'm focused on this naming 
>> requirement).
>>
>> Method 4.1 allows for a single generic declaration of the meta-data 
>> block and its contents; adding one's own custom code list requires no 
>> schema changes to the meta-data block (though schema changes are 
>> required for the custom code list use and probably data type).
>>
>> Method 4.2 requires changing the schema for the meta-data block when 
>> also changing the schema for a custom code list.
> 
> 
> The only benefit I can see for method 4.2 is that the code list schema 
> module that actually defines the relevant simple type constraining code 
> values could have the same namespace and could be made to contain the 
> same "code list declaration", which has benefits for in-place 
> documentation of the code list schema module itself.  But there are 
> other ways to achieve this, without the added cost of changing the 
> meta-data block schema every time you touch the code list schema module.
> 
>> Out-of-the-box-UBL defines the code list namespace URI strings for all 
>> of the code lists (and they are arbitrary and need not have namespace 
>> URI fields) ... but because they are arbitrary, a user can change in 
>> the instance a code list's supplementary components without any 
>> validation they are the same supplementary components as 
>> out-of-the-box-UBL.
> 
> 
> I'm not sure I understand this, though it may be because I haven't kept 
> up with the latest status of OOTB UBL on this point.  But our previous 
> datatyping solution certainly allowed code list owners to set up code 
> list schema modules with supplementary-component attributes whose values 
> were *not* fixed, which could have allowed UBL document creators to mess 
> with the values of those attributes right in the instance, right next to 
> the relevant code.  The same sort of abuse (?) thus would have been 
> possible before.
> 
>> Does this last point mean we've missed the boat (again!) because two 
>> trading partners can use the same schema and same URIs but change the 
>> supplementary components in the instance?  Or are business issues 
>> covered because any changes to the supplementary components are, 
>> indeed, in the instance and not hidden anywhere?  Stephen, are legal 
>> concerns addressed by all this?  Or is this a "feature" for the end 
>> user of UBL?
> 
> 
> I believe that we need to think of the code list schema module's target 
> namespace as the "real" namespace URI assignment, and the declaration of 
> that namespace URI in a meta-data block as a *reference* to the 
> namespace defined by the code list owner, in precisely the sense that 
> supplying a code is intended to convey a *reference* to the real, 
> official meaning of that code as defined by its owner.  Anything else is 
> madness.
> 
> As long as a sender creates a UBL document that (a) supplies a valid 
> code in an element/attribute field and (b) references the intended code 
> list namespace URI, the recipient (along with any intermediate auditors) 
> gets both semantic clarity and "availability" of those semantics, albeit 
> with the thinnest possible indirections from the point of code usage to 
> the top of the document.
> 
>> p.s. I'll reiterate that I'm helping document the discussions 
>> this week on naming of information items with coded values, but that I 
>> am *not* the person with whom to discuss the latest CLSC paper and 
>> issues of data types ... I've posted my comments on that to the CLSC 
>> list earlier today for input into the discussion by the CLSC group.
> 
> 
> I've commented a bit here on the connection of this proposal to the code 
> list schema module.  Ideally the final worked example will make this 
> connection explicit and test it to make sure it holds up.
> 
> Talk to you folks soon,
> 
>     Eve

-- 
Eve Maler                                        +1 781 442 3190
Sun Microsystems                            cell +1 781 354 9441
Web Products, Technologies, and Standards    eve.maler @ sun.com


