ubl-ndrsc message

Subject: Re: [ubl-ndrsc] Code lists: discussion kickoff
From: Phil Griffin <phil.griffin@ASN-1.com>
To: ubl-ndrsc@lists.oasis-open.org
Date: Fri, 01 Feb 2002 08:57:32 -0500

Arofan,

"Gregory, Arofan" wrote:
> 
> Phil:
> 
> Let me try to answer your points in a general way:
> 
> First, when we talk about "code lists" I am assuming that we are restricting
> ourselves (as we did in xCBL) to those lists of commonly used, well-defined,
> externally-maintained "codes" that come from places like X12, ISO, and the
> UN/CEFACT Codes Working Group. For xCBL, we harmonized these codes in some
> cases, and in others, we chose to subset them for our own uses, but we have
> clear maps back to the definitions commonly understood in business today.
> 
> We *cannot* use any controlled construction in UBL - be it an element or
> attribute name, or a value in an enumerated list - that we do not in some
> way completely and unambiguously define. Otherwise, we have failed in
> creating a useful language for e-business.
> 
> In general, I agree with you - we *must* be unambiguous, using formal
> references to that work of other bodies that we base ours on, if indeed we
> choose to do this.
> 
> As for alphabetic constraints, XSD does give us the ability to do pattern
> constraints called "regular expressions", so I think we could do what you
> suggest, but a simple enumeration datatype will get us to the same place. I
> don't think parsers yet support the regular expression stuff, although they
> might.

ASN.1 has a standard regular expression notation based on
the one from XSD with minor corrections and two others in
common use. But there is probably little tool support for
this feature (outside of Paul Thorpe's :-) and regular
expressions are nasty to look at and easy to get wrong. I
think I've only used one in all the standards I've written:

   DNSName ::= 
      VisibleString (SIZE(1..MAX)) (PATTERN "[A-Za-z0-9 .-]*")

Not something that should ever be inflicted on a business
person (or children and small animals for that matter).
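
To make that constraint concrete, here is a minimal sketch in Python,
used purely for illustration. It assumes the PATTERN must match the whole
string (as XSD patterns do). The name is_dns_name is hypothetical, and
Python's re module only stands in for an ASN.1 tool; it is not part of
any ASN.1 or XSD toolkit.

```python
import re

# Same character class as the PATTERN constraint above; the trailing "-"
# inside the brackets is a literal hyphen.
DNS_NAME_PATTERN = re.compile(r"[A-Za-z0-9 .-]*")


def is_dns_name(value: str) -> bool:
    """Return True if value satisfies SIZE(1..MAX) and the PATTERN."""
    # SIZE(1..MAX): at least one character is required.
    if len(value) < 1:
        return False
    # The pattern must match the whole string, not just a prefix.
    return DNS_NAME_PATTERN.fullmatch(value) is not None
```

Note that the size constraint does real work here: the pattern alone
would accept the empty string.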

> As for validation, you do have a good point - few users ever support all of
> the codes in a long code list. But the validation issue depends on
> something else. My mental picture of how SMEs will use this stuff has a
> lower bound, which is that they view business documents in a browser, based
> on a hosted application that can do only two simple things: (1) parse the
> document against the schemas; and (2) run it through an XSL or CSS
> stylesheet to produce a display form compatible with today's web-browser
> technology.

Ah. I have a slightly different vision. Rather than generalized
UBL documents formed from XML markup undergirded with some sort
of schema for validation purposes, I see more tightly defined
messages in small protocols: subsets, if you will, extracted from
the bigger pool of UBL generality.

I see bank customers making purchases through agreed-to common
business interfaces on wireless phones.  I see RFIDs in the supply
chain beaconing, "I'm an invoice for your purchase order #56789". 
So I see a need for minimizing code space on the device and bits
on the line during transfer.

Rather than on-the-fly negotiation between business partners as
to the characteristics of the business forms that they will use,
I see large corporations and government entities dictating the
form and use of specific UBL instances to their partners and
clients - use this if you intend to get paid.

And I see processing and security requirements associated with
these messages, not just some markup tags with strong typing.
This is what I mean by protocol. 

> There are several companies - mine among them - that offer this type of
> low-level, hosted functionality, and it is generally seen as the basic
> replacement for FAX-based processes used by the EDI VANs, called "Rip &
> Read". These applications - because they are generic, XML-based applications
> - typically do not offer detailed functionality about the mappings between
> sets of codes that are common in more fully automated EDI implementations.
> 
> Because of this, I feel that being able to validate code lists with generic
> XSD parsers is very important. So is limiting, to the extent practical, the
> sets of enumerated values that people use to express semantics within UBL.

Agree. But I would see the ability to cull these lists
for a given instance of a UBL document as an important
document construction feature, similar to the need to
order the fields and include or exclude some optional
generalized document features.
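
A minimal sketch of the kind of culling meant here, in Python for
illustration only. The names FULL_COUNTRY_CODES, cull, is_valid and
SHIP_TO_CODES are hypothetical, and the four codes are just a sample;
a real list would come from the body that maintains it.

```python
# Hypothetical excerpt of a standard code list (not the full list).
FULL_COUNTRY_CODES = frozenset({"USA", "CAN", "FRA", "JPN"})


def cull(full_list: frozenset, wanted: set) -> frozenset:
    """Return the subset of full_list allowed in one document definition.

    Raises ValueError if a wanted code is not in the standard list, so a
    culled list can never introduce a non-standard code.
    """
    unknown = set(wanted) - full_list
    if unknown:
        raise ValueError(f"not in the standard list: {sorted(unknown)}")
    return frozenset(wanted)


def is_valid(code: str, allowed: frozenset) -> bool:
    """Check one code value against the culled list."""
    return code in allowed


# A shipper serving only the US and Canada.
SHIP_TO_CODES = cull(FULL_COUNTRY_CODES, {"USA", "CAN"})
```

With this, "USA" validates against SHIP_TO_CODES while "FRA" does not,
even though both are in the standard list.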

As to transfer syntax, security and schemas, maybe what
I ultimately need is not UBL, but a derivative or version
of UBL that is mobile-capable. M-UBL, or MUBL - I'm liking
the sound of that mnemonic. So maybe what I'm after here
is out of scope for this TC.

Phil

> Cheers,
> 
> Arofan
> 
> -----Original Message-----
> From: Phil Griffin [mailto:phil.griffin@ASN-1.com]
> Sent: Thursday, January 31, 2002 12:03 PM
> To: ubl-ndrsc@lists.oasis-open.org
> Subject: Re: [ubl-ndrsc] Code lists: discussion kickoff
> 
> "Gregory, Arofan" wrote:
> >
> > Folks:
> >
> > I've thought a lot about this issue, and I believe the trade-off is this:
> >
> > (1) Using elements to represent codes is one possibility, that gives us the
> > advantage of being able to validate a code from a controlled list. Also, if
> > we wrap these in a parent type, the list can be extended. (Ugly, but it
> > works.) For companies that have expensive validation software to handle
> > code-lists, this isn't a problem, but it is a problem for the little guys.
> > We can get free code-list standardization and validation from this approach,
> > which I think is good. The down-side is that designing and maintaining these
> > code-lists is a bitch. (Many, many versions of our schemas that do nothing
> > but update code-lists). Perhaps we could have special namespaces for
> > codelists, and have special rules so that versioning is not done by
> > namespace but with an attribute? Just a thought.
> 
> Just a point here. Code lists in themselves do not
> always guarantee interworking applications. Unless
> each code list item is bound to an unambiguous
> textual definition there can still be problems.
> 
> Case in point, the characters "AML". When the notion
> of using ASN.1 as an XML schema was first proposed, I
> used these characters to describe our work. But when
> we did a Google search we found so many other uses of
> these same characters, we switched to XER.
> 
> So code lists can help in validation, but they may not
> provide a 100% solution even when the list of codes is
> fixed. And my guess is that the longer the list of codes,
> and the greater the number of list users from different
> disciplines, the more likely such problems will arise.
> 
> The result: you and I will both use AML, each of us with
> a totally different meaning.
> 
> > (2) Using the "string" approach will absolutely defeat any hope of
> > interoperability without benefit of expensive translation software. The EDI
> > experience has shown that people will happily invent their own
> > non-interoperable codes. In xCBL we allowed for this with the "CodedOther"
> > approach: all code lists have an enumeration of choices, and then a sister
> > element that holds a non-standard code. If you choose the "Other" code, then
> > you have to fill in the string. This approach is not, in my opinion, the
> > best solution, but it may be the best we can do with XML Schema. Using just
> 
> I agree. This approach, while not perfect as you
> say, is at least far simpler than the one you
> describe below. Can we go this way for version
> one (for speed of work) and change our minds in
> a later version to a more complex solution such
> as you describe below without causing significant
> problems?
> 
> > a string makes it not necessary to maintain codelists at all, but sacrifices
> > much of the benefit of having a UBL, in my opinion.
> 
> It does push the actual validation off to the
> application. But given the length of the code
> list examples I've seen, I wonder whether, for a
> given user, all of the ones listed would
> REALLY be valid for that user's application?
> 
> Seems to me, as an example, if I only ship to the
> US and Canada, that for my document only USA and
> CAN might be valid out of the list of all country
> codes. What benefit would I get from JPN and FRA
> being valid?
> 
> When an actual instance document is created for a
> UBL user, will we provide support for specifying
> further granularity of code list constraints?
> 
> > (3) Codelists as enumerated data types. This is my preferred approach - a
> > codelist is, in fact, an enumeration of specific semantics, and this format
> > makes it clear and easier to manage. What we need is an ability to extend
> > these (a major failing of XML schema).
> 
> I have an enumerated type in my favorite schema language,
> but essentially its named values are treated as integers.
> 
> But I can also view code lists differently using what is
> termed a permitted alphabet constraint, a set of the sets
> of characters that determine what is valid for an instance
> of a given user defined type.
> 
> This allows me to express the valid sets of characters that
> can be used in a given field of some type, say as
> 
>    MyCodeList ::= UTF8String ("ABC" | "BAX", ... )
> 
> The "extension marker" ( ... ) instructs tools to also
> expect other values not in the list, so I do not need to
> code up an "Other" choice alternative.
> 
> But I am almost certain that such permitted alphabet
> constraints do not exist in XSD.
> 
> > Let me suggest:
> >
> > (1) Dedicated namespaces for codelists (one per codelist, or related group
> > of codelists)
> > (2) Allow these namespaces to be static - that is, not versioned.
> > (3) Have a "version" associated with the codelist in a way that does not
> > change the name of the namespace. (Could we use XSD "version" for this?)
> >
> > This way, we could version our structures and our codelists separately.
> > This models the best part of EDI, where it is common practice to update
> > codelists versions within an older version of message structures. And all
> > this, while not throwing away the ability to validate codelists with a
> > parser.
> 
> This seems a reasonable approach. But how is interoperability
> maintained when a code list item is removed? Are we affected if
> an item with one meaning in code list version A is given another
> meaning in code list version B? My question here is what happens
> in terms of interoperability if you are using A and I am using B?
> 
> Phil
> 
> > To subscribe or unsubscribe from this elist use the subscription
> > manager: <http://lists.oasis-open.org/ob/adm.pl>
> >
> 