Subject: RE: [ubl-ndrsc] Code lists: discussion kickoff
The problem with this is that the codelists tend to be very long at the schema level but much more restricted at the instance level (after application of context). In other words, my currency list might contain 100 items, but in reality only 5 are likely for my specific application. So using enumerations to generate forms directly from a schema, with nice dropdown lists for the codelists, isn't all that advantageous. You still need a mechanism for specifying which items from the huge grab bag of choices are actually needed for the application. That's why I think it might be worthwhile to reject the idea of enumeration altogether and just go with appInfo. Maybe we could use our context mechanism to create more manageable context-specific enumerations in the schema...

Matt

-----Original Message-----
From: Gregory, Arofan [mailto:arofan.gregory@commerceone.com]
Sent: Thursday, January 31, 2002 10:58 PM
To: 'Phil Griffin'; ubl-ndrsc@lists.oasis-open.org
Subject: RE: [ubl-ndrsc] Code lists: discussion kickoff

Phil:

Let me try to answer your points in a general way:

First, when we talk about "code lists" I am assuming that we are restricting ourselves (as we did in xCBL) to those lists of commonly used, well-defined, externally-maintained "codes" that come from places like X12, ISO, and the UN/CEFACT Codes Working Group. For xCBL, we harmonized these codes in some cases, and in others, we chose to subset them for our own uses, but we have clear maps back to the definitions commonly understood in business today.

We *cannot* use any controlled construction in UBL - be it an element or attribute name, or a value in an enumerated list - that we do not in some way completely and unambiguously define. Otherwise, we have failed in creating a useful language for e-business. In general, I agree with you - we *must* be unambiguous, using formal references to the work of other bodies on which we base ours, if indeed we choose to do this.

As for alphabetic constraints, XSD does give us the ability to do pattern constraints based on "regular expressions", so I think we could do what you suggest, but a simple enumeration datatype will get us to the same place. I don't think parsers yet support the regular expression stuff, although they might.

As for validation, you do have a good point - few users ever support all of the codes in a long code list. But the validation issue depends on something else. My mental picture of how SMEs will use this stuff has a lower bound, which is that they view business documents in a browser, based on a hosted application that can do only two simple things: (1) parse the document against the schemas; and (2) run it through an XSL or CSS stylesheet to produce a display form compatible with today's web-browser technology.

There are several companies - mine among them - that offer this type of low-level, hosted functionality, and it is generally seen as the basic replacement for FAX-based processes used by the EDI VANs, called "Rip & Read". These applications - because they are generic, XML-based applications - typically do not offer detailed functionality about the mappings between sets of codes that are common in more fully automated EDI implementations.

Because of this, I feel that being able to validate code lists with generic XSD parsers is very important. So is limiting, to the extent practical, the sets of enumerated values that people use to express semantics within UBL.

Cheers,

Arofan

-----Original Message-----
From: Phil Griffin [mailto:phil.griffin@ASN-1.com]
Sent: Thursday, January 31, 2002 12:03 PM
To: ubl-ndrsc@lists.oasis-open.org
Subject: Re: [ubl-ndrsc] Code lists: discussion kickoff

"Gregory, Arofan" wrote:
>
> Folks:
>
> I've thought a lot about this issue, and I believe the trade-off is this:
>
> (1) Using elements to represent codes is one possibility, that gives us the
> advantage of being able to validate a code from a controlled list.
> Also, if we wrap these in a parent type, the list can be extended.
> (Ugly, but it works.) For companies that have expensive validation
> software to handle code-lists, this isn't a problem, but it is a problem
> for the little guys.
> We can get free code-list standardization and validation from this approach,
> which I think is good. The down-side is that designing and maintaining these
> code-lists is a bitch. (Many, many versions of our schemas that do nothing
> but update code-lists). Perhaps we could have special namespaces for
> codelists, and have special rules so that versioning is not done by
> namespace but with an attribute? Just a thought.

Just a point here. Code lists in themselves do not always guarantee interworking applications. Unless each code list item is bound to an unambiguous textual definition there can still be problems.

Case in point, the characters "AML". When the notion of using ASN.1 as an XML schema was first proposed, I used these characters to describe our work. But when we did a google search we found so many other uses of these same characters, we switched to XER.

So code lists can help in validation, but they may not provide a 100% solution even when the list of codes is fixed. And my guess is that the longer the list of codes, and the greater the number of list users from different disciplines, the more likely such problems will arise. The result: you and I will both use AML, each of us with a totally different meaning.

> (2) Using the "string" approach will absolutely defeat any hope of
> interoperability without benefit of expensive translation software. The EDI
> experience has shown that people will happily invent their own
> non-interoperable codes. In xCBL we allowed for this with the "CodedOther"
> approach: all code lists have an enumeration of choices, and then a sister
> element that holds a non-standard code. If you choose the "Other" code, then
> you have to fill in the string.
> This approach is not, in my opinion, the
> best solution, but it may be the best we can do with XML Schema. Using just

I agree. This approach, while not perfect as you say, is at least far simpler than the one you describe below. Can we go this way for version one (for speed of work) and change our minds in a later version to a more complex solution, such as the one you describe below, without causing significant problems?

> a string makes it not necessary to maintain codelists at all, but sacrifices
> much of the benefit of having a UBL, in my opinion.

It does push the actual validation off to the application. But given the length of the code list examples I've seen, I wonder whether, for a given user, all of the ones listed would REALLY be valid for that user's application. Seems to me, as an example, if I only ship to the US and Canada, that for my document only USA and CAN might be valid out of the list of all country codes. What benefit would I get from JPN and FRA being valid? When an actual instance document is created for a UBL user, will we provide support for specifying further granularity of code list constraints?

> (3) Codelists as enumerated data types. This is my preferred approach - a
> codelist is, in fact, an enumeration of specific semantics, and this format
> makes it clear and easier to manage. What we need is an ability to extend
> these (a major failing of XML schema).

I have an enumerated type in my favorite schema language, but essentially its named values are treated as integers. But I can also view code lists differently using what is termed a permitted alphabet constraint, a set of the sets of characters that determine what is valid for an instance of a given user defined type. This allows me to express the valid sets of characters that can be used in a given field of some type, say as

MyCodeList ::= UTF8String ("ABC" | "BAX", ... )

The "extension marker" ( ... ) instructs tools to also expect other values not in the list, so I do not need to code up an "Other" choice alternative. But I am almost certain that such permitted alphabet constraints do not exist in XSD.

> Let me suggest:
>
> (1) Dedicated namespaces for codelists (one per codelist, or related group
> of codelists)
> (2) Allow these namespaces to be static - that is, not versioned.
> (3) Have a "version" associated with the codelist in a way that does not
> change the name of the namespace. (Could we use XSD "version" for this?)
>
> This way, we could version our structures and our codelists separately.
> This models the best part of EDI, where it is common practice to update
> codelists versions within an older version of message structures. And all
> this, while not throwing away the ability to validate codelists with a
> parser.

This seems a reasonable approach. But how is interoperability maintained when a code list item is removed? Are we affected if an item with one meaning in code list version A is given another meaning in code list version B? My question here is what happens in terms of interoperability if you are using A and I am using B?

Phil

> To subscribe or unsubscribe from this elist use the subscription
> manager: <http://lists.oasis-open.org/ob/adm.pl>
> ----------------------------------------------------------------

To subscribe or unsubscribe from this elist use the subscription manager: <http://lists.oasis-open.org/ob/adm.pl>
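
The trade-off argued over in this thread (a long schema-level enumeration, a much smaller context-specific subset, and an "Other" escape with a sister string) can be sketched roughly as follows. This is only an illustration in Python rather than XSD, not anything proposed on the list: the names FULL_COUNTRY_CODES, SHIPPING_CONTEXT, CodedValue and validate are all hypothetical, and the five country codes are a made-up excerpt, not a real UBL or xCBL code list.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Hypothetical excerpt of a schema-level code list; a real list would be much longer.
FULL_COUNTRY_CODES: FrozenSet[str] = frozenset({"USA", "CAN", "FRA", "JPN", "DEU"})

# Hypothetical context-specific subset: a shipper that serves only the US and Canada.
SHIPPING_CONTEXT: FrozenSet[str] = frozenset({"USA", "CAN"})

OTHER = "Other"  # escape value in the style of the "CodedOther" approach


@dataclass(frozen=True)
class CodedValue:
    code: str                    # a listed code, or OTHER
    other: Optional[str] = None  # free-text code, meaningful only with OTHER


def validate(value: CodedValue,
             full_list: FrozenSet[str],
             context: Optional[FrozenSet[str]] = None) -> List[str]:
    """Return a list of problems; an empty list means the value is accepted."""
    if value.code == OTHER:
        # The sister string must be filled in when "Other" is chosen.
        if not value.other:
            return ["'Other' requires a non-empty free-text code"]
        return []
    problems: List[str] = []
    if value.other is not None:
        problems.append("free-text code is only allowed together with 'Other'")
    if value.code not in full_list:
        # What a generic parser could catch from the enumeration alone.
        problems.append(f"{value.code!r} is not in the code list")
    elif context is not None and value.code not in context:
        # What only a context-aware check (or the application) can catch.
        problems.append(f"{value.code!r} is in the code list but not in this context")
    return problems
```

With these assumptions, validate(CodedValue("FRA"), FULL_COUNTRY_CODES) accepts the value, while passing SHIPPING_CONTEXT as the third argument reports it as outside the context. That is the gap between schema-level and instance-level validation that Matt and Phil both point at.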