
Subject: [ubl-ndrsc] Arofan's RT/CCT Draft Definitions

From: "Burcham, Bill" <Bill_Burcham@stercomm.com>
To: ubl-ndrsc@lists.oasis-open.org
Date: Wed, 10 Apr 2002 06:44:48 -0500

I agree with just about everything Arofan says here right down to the very end where he makes the point about the terms being backward (that RT and CCT should properly be reversed). It was informative to finally hear such a lucid explanation of the history behind these two constructs. If we could get this sorted out to the satisfaction of UBL and CC folks this would be a huge step forward.

The only thing I'm left wondering about is this: in light of Arofan's proposal do we really need what is now called RT's at all?

What's the alternative (to dropping RT's) -- do we leave RT's in the UBL model and do they get reified as XSD (complex) types? And then do we go around extending those types with the CCT (complex) XSD types? If that's the case then how will I reuse one of the CCT's between two RT's -- for instance the Text CCT is gonna be a really widely used structure -- across many RT's. There isn't any multiple inheritance in XSD -- that's what you'd need in order to implement the many-to-many relationship between base and derived types suggested in Arofan's proposal.

An alternative is for the UBL _model_ to treat RT as an "adornment" or "stereotype" for documentation purposes but to elide it when it comes to the XSD model -- or to stick it in as a post-validation default attribute the way we're doing with uuid's. I don't want to get deep into exactly how we do that. I think we can work that out in a separate thread.

I propose we add a little clarification to the first leg of Arofan's proposal:

(1) Have a set of Representation terms in the UBL working model which function as "semantic
primitives," as originally intended by the CC group in ebXML. (The list may
need to be altered slightly to include some missing types, but will not
undergo wholesale expansion). This indicates what the business purpose of
the data is, in an abstract sense, wholly separate from how it will be
represented when syntax bound. When the UBL working model is syntax bound
into XSD the RT information will be represented in a manner similar to uuid's.
There will be no XSD types for RT's. There will be XSD types for CCT's.

-Bill

-----Original Message-----
From: Gregory, Arofan [mailto:arofan.gregory@commerceone.com]
Sent: Friday, April 05, 2002 2:14 PM
To: Maler, Eve; ubl-ndrsc@lists.oasis-open.org
Subject: [ubl-ndrsc]

Eve/NDR Folks:

Sorry this is late, but here is a brief write-up and suggested approach for
resolving the RT/CCT issues.

Cheers,

Arofan

________________________________________________

Representation terms/Core Component Types:

UBL needs to clearly define the role of CCTs and RTs, since the Core
Components spec fails to do this. I want to go over the history of the
development of these concepts, so that we can make sure that we produce a set
of definitions that are sufficient to the needs of UBL.

Originally, there were *NO* CCTs. All we had in the CC work was a set of
semantic "primitives" that were called RTs. The list in the spec has not
changed much since then. At one of the ebXML meetings it was realized that
although some of the RTs were single bits of data, others were actually data
composites. For each RT, one (and exactly one, in theory) CCT was indicated,
to suggest the semantics of the properties needed to fully express the use
of the RT.

As we start trying to use the spec set up in this way (UBL and others), we
have been realizing that there are some significant failings in this system.
One great example of this is a Location Code, a common bit of data that
cannot be described as a set of enumerated values - what we traditionally
and typically think of as a 'code list' - even though its business function
is that of a code.

Core Components did not account for this phenomenon. Because it functions as
a "code", it has a semantic primitive RT of "Code" - this means that it has
a "Code" CCT, which allows you to point to a code-list and supporting
properties.

Unfortunately, this doesn't work. For this type of a "code" you need a
different set of properties in your physical expression of the business
data. (In the case of location code, this is a pattern, not an enumeration -
a different type of simple datatype in XSD, for example.)
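
To make the distinction concrete, here is a rough sketch (an illustration only, not part of the CC spec or any UBL deliverable) of the two kinds of "code". The code list values, the location-code pattern, and the function names are all made up for the example.

```python
import re

# Hypothetical enumerated code list: validity means membership in a fixed set.
CURRENCY_CODES = {"USD", "EUR", "AUD"}

# Hypothetical location-code pattern (an assumption for illustration only):
# two uppercase letters, a space, then three uppercase letters or digits.
LOCATION_CODE_PATTERN = re.compile(r"[A-Z]{2} [A-Z0-9]{3}")


def is_valid_enumerated(value, code_list):
    """A classic code list: the value must be one of the listed codes."""
    return value in code_list


def is_valid_pattern(value, pattern):
    """A pattern-based code: the value must match the whole pattern."""
    return pattern.fullmatch(value) is not None
```

Both values function as codes in the business sense, but they need different supporting properties: one points at a list, the other at a pattern.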

This is true not only of codes, but also of other kinds of data. We need to
specify *both* the semantic primitive of the model (the Representation Term,
traditionally) *and* the set of properties by which it is made manifest (the
CCT), and there is neither a one-to-one relationship between RTs and CCTs,
nor a many-to-one relationship between RTs and CCTs. It is, by the practical
dictates of business information, a many-to-many, unless we substantially
increase the range of our RTs and CCTs. We must either extend these lists
significantly, or we must allow them to be combinatorial. We cannot know how
to clearly define CCTs and RTs until we understand whether they are
combinatorial or not.

There is another requirement that highlights this need, and it is something
that has not been addressed so far in UBL, other than as a comment against
the draft order from the LCSC.

This is the absolute need for a solid description of the physical
representation of data, at a finer level of detail than is currently
possible.

Take, for example, the degree of precision of a price. This is a fundamental
kind of data-type issue, since the degree of precision in prices defines the
tolerances used in the calculations for essential business processes such as
Order/ASN/Invoice reconciliation (aka "book-keeping").

If we are to describe a "price" using the current system, here is what we
would know about it:

RT = "Amount" (which is always a monetary amount according to the CC
definitions)
CCT = "AmountType" (which gives us a number and a currency code)

The degree of precision of the price cannot be specified in the semantic
model, given these capabilities. All monetary amounts are the same. But in
reality, prices have a very different specificity than some other monetary
amounts. This means we cannot simply assume a single precision for all
monetary amounts, and make that part of our syntax-binding.

In other words, there is a need to capture in the model some distinction
between a price and another kind of monetary amount, since they have
different requirements in terms of how they are represented. Please note
that, in this example, precision is *not* syntax-specific. It is a critical
property of the business data itself, and can be described equally well in
many different syntaxes.
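
As a small, syntax-neutral illustration of why this matters for reconciliation, the sketch below treats precision as a property of the representation. The class name, the representation names, and the choice of 4 and 2 decimal places are assumptions made for the example, not values taken from the CC definitions.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class MonetaryRepresentation:
    """Hypothetical representation of a monetary amount with a fixed precision."""
    name: str
    decimal_places: int

    def quantize(self, value: Decimal) -> Decimal:
        # Round the value to this representation's number of decimal places.
        exponent = Decimal(10) ** -self.decimal_places
        return value.quantize(exponent, rounding=ROUND_HALF_UP)


# Assumed precisions, for illustration only.
UNIT_PRICE = MonetaryRepresentation("UnitPrice", 4)
INVOICE_TOTAL = MonetaryRepresentation("InvoiceTotal", 2)

raw_price = Decimal("0.12345")
quantity = 10000

# Price kept at its own precision, total rounded only at the end.
fine_price = UNIT_PRICE.quantize(raw_price)
line_total_fine = INVOICE_TOTAL.quantize(fine_price * quantity)

# Price forced to the same precision as every other monetary amount.
coarse_price = INVOICE_TOTAL.quantize(raw_price)
line_total_coarse = INVOICE_TOTAL.quantize(coarse_price * quantity)
```

With these made-up numbers the two line totals differ by 35.00, which is the kind of discrepancy that breaks Order/ASN/Invoice reconciliation if both parties do not agree on the precision of a price.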

SUGGESTED APPROACH AND DEFINITIONS:

I would suggest the following approach to solving these difficulties:

(1) Have a set of Representation terms, which function as "semantic
primitives," as originally intended by the CC group in ebXML. (The list may
need to be altered slightly to include some missing types, but will not
undergo wholesale expansion). This indicates what the business purpose of
the data is, in an abstract sense, wholly separate from how it will be
represented when syntax bound.

(2) The list of CCTs should be expanded to reflect the actual needs of
expressing business data in a syntax, to cover those cases where the syntax
itself is not the determining factor, but rather the representation of the
business data (as is the case for numeric precision of prices). Alternatively,
the CCTs could have properties added to them so that the range of possible
representation formats is one of the properties. This would work very neatly
for something like datetimes, for example, but might become very confusing
and clunky in referring to numeric formats.

In either case, the range of expressive possibilities for CCTs should be
expanded.

(3) Each representation term could be combined with some specified subset of
the available CCTs, as determined by the requirements of reality. An
"Identifier" RT could be a CodeType, or it could be a TextType, but we
would no longer have ambiguous types like "IdentifierType". The CCT would
now define an exact set of properties that would describe the actual
representation of the data, rather than describing its semantic or business
function. (This is essentially abstracting the idea of simpleTypes in XSD up
a level, but still referring to the actual physical representation of the
data, rather than its semantic.) This would result in a significant
increase in the number of CCTs (or their expressive range) and would result
in the removal of some of the existing ones. The current list assumed a
many-to-one relationship between RTs and CCTs that is not useable in
reality. We need to re-work the list to reflect this finding.
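
One possible way to picture the combinatorial (many-to-many) relationship in point (3) is a simple lookup from each RT to the subset of CCTs it may be combined with. This is only a sketch: the table contents and the "PatternType" name are hypothetical, and the actual lists would have to be worked out by the group.

```python
# Hypothetical RT -> allowed CCTs table. "PatternType" is an invented name
# standing in for whatever CCT would describe pattern-based codes.
ALLOWED_CCTS = {
    "Identifier": {"CodeType", "TextType"},
    "Code": {"CodeType", "PatternType"},
    "Amount": {"AmountType"},
}


def is_allowed(rt, cct):
    """True if the given CCT may be used to represent the given RT."""
    return cct in ALLOWED_CCTS.get(rt, set())


def rts_using(cct):
    """All RTs that may be represented by the given CCT, sorted by name."""
    return sorted(rt for rt, ccts in ALLOWED_CCTS.items() if cct in ccts)
```

In this sketch one RT maps to several CCTs ("Identifier" to CodeType or TextType) and one CCT serves several RTs (CodeType serves both "Identifier" and "Code"), which is the many-to-many shape described above.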

NOTE:

The names "Representation Term" and "CCT" are not particularly good, so if
you want to reverse them, you could. I wanted to avoid confusion within CC,
however, by not redefining them as their exact opposites. Traditionally, the
RT was the semantic primitive, and the CCT was a construct that encapsulated
the property set for representing that primitive.

