ubl-comment message

Subject: [ubl-comment] Methodology Paper Comments

From: "Miller, Robert (GXS)" <Robert.Miller@gxs.ge.com>
To: "'ubl-comment@lists.oasis-open.org'" <ubl-comment@lists.oasis-open.org>
Date: Thu, 11 Jul 2002 17:12:48 -0400

Comments from Bob Miller on:
Position Paper: Library Content Methodology

Gentle people,

Overall, I found this position paper to be well formulated. I have been encouraged by the relatively formal approach taken within this team. This paper draws upon respected and proven information design principles (Relational Theory, Model Normalization, Object Classes). In a few places, this document seems to overlook its own principles, most particularly when accepting without careful examination some parallel work. An example is in sections 2.3.12.1 Applying Context to UBL and 3.3.12 Context:

IMO, the discussion in these sections should have pointed out that Contexts are properties of a class. ShippingContact and BillingContact simply constrain a Context property. Instead, I read in 3.3.2 "In many vocabularies, context is suggested by the component's name." And that's also what I see in the example from the UBL vocabulary: two BIEs whose class is Contact, but whose context is "suggested by the component's name." "Suggested" doesn't cut the mustard! In fact, ShippingContact and BillingContact should be represented by subclasses of Contact, each of which constrains a Context property of the parent class.
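To make the point concrete, here is a minimal sketch in Python of context carried as a class property and constrained by subclasses. The attribute names (`context`, `ALLOWED_CONTEXTS`) and the value set are assumptions for illustration only; they are not taken from the position paper or from the UBL library.

```python
class Contact:
    """Parent class: context is an explicit property, not a hint in the name."""

    # Hypothetical set of context values the parent class permits.
    ALLOWED_CONTEXTS = {"Shipping", "Billing"}

    def __init__(self, name, context):
        # Each class checks the context against its own allowed set.
        if context not in self.ALLOWED_CONTEXTS:
            raise ValueError(
                f"context {context!r} not allowed for {type(self).__name__}"
            )
        self.name = name
        self.context = context


class ShippingContact(Contact):
    """Subclass that constrains the parent's context property to one value."""

    ALLOWED_CONTEXTS = {"Shipping"}

    def __init__(self, name):
        super().__init__(name, "Shipping")


class BillingContact(Contact):
    """Subclass that constrains the parent's context property to one value."""

    ALLOWED_CONTEXTS = {"Billing"}

    def __init__(self, name):
        super().__init__(name, "Billing")
```

With this shape, the name ShippingContact is only a label; the semantics live in the constrained property, which a tool can read without parsing names.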

Perhaps the root source of my concern with the example of "Contact" is really found in this position paper's discussion and table 2 of "Type" in section 2.3.10. The discussion observes that "data types are just another form of entity/object class/aggregate BIE." But then it fixes a 'basic type' at too high a level (for guidance, see the XSD basic and derived types, and note the properties these types establish and inherit). And it suggests that a couple of layers of refinement are sufficient.

In my analysis of X12 vocabulary, I have found that (most) individual code list values identify 'semantic primitives'. They effectively point at a set of defining metadata, and they have no associated instance value (have no value property). Such primitives may of course appear in multiple code lists. And these code lists in turn are typically associated with entities which do carry associated instance values (have a value property). Bottom line is, if the semantic entity appears in a code list, and that list is associated with an entity that does have a value property, then that semantic primitive is a property of the entity with which it is associated. In X12, there are some 'basic business data (type) elements' like amount. In usage, they are sometimes associated with a code list. At other times, they appear without such association, but are embellished in the segment definition by a 'semantic note' that 'fixes' the value of one or more properties of the basic business data (type) element 'amount'. From a semantic viewpoint, TotalDollarAmount and Amount context="TL" currency="US" are identical.
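A minimal sketch of that equivalence, in Python. The `Amount` class and the `total_dollar_amount` helper are hypothetical names for illustration; the property values "TL" and "US" are the ones used in the example above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Amount:
    """Basic business data element with a value property plus qualifying properties."""

    value: Decimal
    context: str   # semantic primitive drawn from a code list, e.g. "TL"
    currency: str  # semantic primitive drawn from a code list, e.g. "US"


def total_dollar_amount(value):
    """'TotalDollarAmount' is just Amount with two properties fixed."""
    return Amount(Decimal(value), context="TL", currency="US")


named = total_dollar_amount("125.00")
qualified = Amount(Decimal("125.00"), context="TL", currency="US")
```

Here `named == qualified` holds, because the two spellings carry the same properties and the same value.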

In designing a business document at a syntax-neutral level, if there is a need to express a total dollar amount, it is advantageous to express that need as an amount with specific property constraints on context and currency. Then a syntax-specific schema generator can use this information, along with a set of grammar rules, to generate an appropriate schema for instance representation. For example, a generator would likely have a rule that a property constraint leaving no more than some (target syntax) threshold number of choices results in generation of a choice of entities, each of which has a fixed property value. A property constraint that leaves more choices than the threshold results in generation of a set of entities that allow/require the property value to be explicit in the data instance.
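One way such a generator rule might look, as a rough Python sketch. It assumes the rule is read as: few allowed values produce one named entity per value with the property fixed, and many allowed values produce a single entity whose property must be explicit in the instance. The function name, the default threshold of 3, the element-naming pattern, and the dictionary layout are all hypothetical.

```python
def plan_elements(base, prop, allowed_values, threshold=3):
    """Decide how a constrained property is rendered for a target syntax."""
    values = sorted(allowed_values)
    if len(values) <= threshold:
        # Few choices: generate one named entity per value, property fixed.
        return [{"element": f"{v}{base}", "fixed": {prop: v}} for v in values]
    # Many choices: one entity, property carried explicitly in the instance.
    return [{"element": base, "required": [prop], "enum": values}]


few = plan_elements("Amount", "currency", {"US", "CA"})
many = plan_elements("Amount", "currency", {"US", "CA", "MX", "GB", "JP"})
```

In this sketch `few` holds two entries (CAAmount and USAmount, each with a fixed currency), while `many` holds a single Amount entry that requires an explicit currency.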

In Section 2.3.14 Assembling Document Definitions I find perhaps a more serious conflict with the foundation this paper lays. But before getting into that, let me suggest that the term 'document' as used earlier in this paper likely is not the same as 'document' as used in this section. I think one or the other usage is inappropriate. I vote 'document' for section 2.3.14, and something else for 2.3.3.

I take some issue with the statement "An hierarchical, top-down and nested tree structure is still the most practical way to define any document's structure." I firmly believe that the most practical way to define any document structure is to "define one or more hierarchical views of the data to be represented in the document from a relational definition of the data." I think in principle that is what you meant to say, but I assert it is not what you said. If you don't like my definition, change the word 'design' to 'represent' in your definition and you at least won't raise my eyebrows. There is of course the nasty 'HL hierarchical loop' issue to address. Perhaps that disappears in a wave of 'context' applied to the logical model. I hope so.
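As an illustration of deriving hierarchical views from a relational definition, here is a small Python sketch. The tables, column names, and the `nest` helper are invented for the example and do not come from the paper.

```python
from collections import defaultdict

# Relational definition: flat rows linked by keys (hypothetical data).
orders = [{"order_id": 1, "buyer": "Acme"}]
items = [{"item": "bolt"}, {"item": "nut"}]
lines = [
    {"order_id": 1, "line_no": 1, "item": "bolt"},
    {"order_id": 1, "line_no": 2, "item": "nut"},
]


def nest(parents, children, key, child_field):
    """Build one hierarchical view by nesting child rows under parent rows."""
    grouped = defaultdict(list)
    for child in children:
        # Drop the join key from the child; the parent already carries it.
        grouped[child[key]].append({k: v for k, v in child.items() if k != key})
    return [{**p, child_field: grouped.get(p[key], [])} for p in parents]


# Two different document hierarchies from the same relational data.
view_by_order = nest(orders, lines, "order_id", "lines")
view_by_item = nest(items, lines, "item", "order_lines")
```

The relational rows stay the single source of truth; each document structure is only one projection of them.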

IMO, the faults I find in this paper are neither in the foundation it lays, nor in the recommendation it makes. It's just a little of the detail that I feel could use some cleanup.

Cheers,
Bob Miller

Follow-Ups:
- Re: [ubl-comment] Methodology Paper Comments
  - From: Tim McGrath <tmcgrath@portcomm.com.au>
- Re: [ubl-comment] Methodology Paper Comments
  - From: Lisa Seaburg <xmlgeek@gmi.net>