ubl-ndrsc message

Subject: [ubl-ndrsc]

Eve/NDR Folks:

Sorry this is late, but here is a brief write-up and suggested approach for
resolving the RT/CCT issues.




Representation terms/Core Component Types:

UBL needs to clearly define the role of CCTs and RTs, since the Core
Components spec fails to do this. I want to go over the history of the
development of these concepts, so that we can make sure that we produce a set
of definitions that are sufficient to the needs of UBL.

Originally, there were *NO* CCTs. All we had in the CC work was a set of
semantic "primitives" that were called RTs. The list in the spec has not
changed much since then. At one of the ebXML meetings it was realized that
although some of the RTs were single bits of data, others were actually data
composites. For each RT, one (and exactly one, in theory) CCT was indicated,
to suggest the semantics of the properties needed to fully express the use
of the RT.

As we start trying to use the spec set up in this way (UBL and others), we
have been realizing that there are some significant failings in this system.
One great example of this is a Location Code, a common bit of data that
cannot be described as a set of enumerated values - what we traditionally
and typically think of as a 'code list' - even though its business function
is that of a code.  

Core Components did not account for this phenomenon. Because it functions as
a "code", it has a semantic primitive RT of "Code" - this means that it has
a "Code" CCT, which allows you to point to a code-list and supporting

Unfortunately, this doesn't work. For this type of a "code" you need a
different set of properties in your physical expression of the business
data. (In the case of location code, this is a pattern, not an enumeration -
a different type of simple datatype in XSD, for example.)
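
To make the distinction concrete, here is a minimal sketch in Python of the
two kinds of "code". Everything in it is an assumption for illustration only:
the class names, the sample currency list, and the location pattern are
invented here and are not taken from the Core Components spec or from any
real code list.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class EnumeratedCode:
    """A code whose legal values are listed explicitly (a 'code list')."""
    allowed: FrozenSet[str]

    def is_valid(self, value: str) -> bool:
        return value in self.allowed


@dataclass(frozen=True)
class PatternCode:
    """A code whose legal values follow a pattern instead of a list."""
    pattern: str

    def is_valid(self, value: str) -> bool:
        # fullmatch requires the whole value to fit the pattern
        return re.fullmatch(self.pattern, value) is not None


# Hypothetical sample data, for illustration only
currency_code = EnumeratedCode(frozenset({"USD", "EUR", "GBP"}))
location_code = PatternCode(r"[A-Z]{2}[A-Z0-9]{3}")
```

Both objects play the business role of a "Code", but they need different
properties to be expressed: one carries a list of values, the other a
pattern.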

This is true not only of codes, but also of other kinds of data. We need to
specify *both* the semantic primitive of the model (the Representation Term,
traditionally) *and* the set of properties by which it is made manifest (the
CCT), and there is neither a one-to-one relationship between RTs and CCTs,
nor a many-to-one relationship between RTs and CCTs. It is, by the practical
dictates of business information, a many-to-many, unless we substantially
increase the range of our RTs and CCTs. We must either extend these lists
significantly, or we must allow them to be combinatorial. We cannot know how
to clearly define CCTs and RTs until we understand whether they are
combinatorial or not.

There is another requirement that highlights this need, and it is something
that has not been addressed so far in UBL, other than as a comment against
the draft order from the LCSC.

This is the absolute need for a solid description of the physical
representation of data, at a finer level of detail than is currently

Take, for example, the degree of precision of a price. This is a fundamental
kind of data-type issue, since the degree of precision in prices defines the
tolerances used in the calculations for essential business processes such as
Order/ASN/Invoice reconciliation (aka "book-keeping").

If we are to describe a "price" using the current system, here is what we
would know about it:

RT = "Amount" (which is always a monetary amount according to the CC
CCT = "AmountType" (which gives us a number and a currency code)

The degree of precision of the price cannot be specified in the semantic
model, given these capabilities. All monetary amounts are the same. But in
reality, prices have a very different specificity than some other monetary
amounts. This means we cannot simply assume a single precision for all
monetary amounts, and make that part of our syntax-binding.

In other words, there is a need to capture in the model some distinction
between a price and another kind of monetary amount, since they have
different requirements in terms of how they are represented. Please note
that, in this example, precision is *not* syntax-specific. It is a critical
property of the business data itself, and can be described equally well in
many different syntaxes.
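
As a rough sketch of what capturing precision in the model might look like,
the Python below adds a precision property to an amount. The
`fraction_digits` field and the sample values are assumptions made for this
illustration; the current AmountType carries only a number and a currency
code.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class AmountType:
    """Hypothetical amount: number, currency code, and declared precision."""
    value: Decimal
    currency: str
    fraction_digits: int  # assumed property, not part of the CC spec

    def __post_init__(self):
        if not self.value.is_finite():
            raise ValueError("amount must be a finite number")
        # A finite Decimal's exponent is minus its number of decimal places
        places = -self.value.as_tuple().exponent
        if places > self.fraction_digits:
            raise ValueError(
                f"{self.value} has more than {self.fraction_digits} "
                "fraction digits"
            )


# Hypothetical values: a unit price needs finer precision than a total
unit_price = AmountType(Decimal("0.0375"), "USD", fraction_digits=4)
invoice_total = AmountType(Decimal("37.50"), "USD", fraction_digits=2)
```

Nothing in this sketch depends on XML or any other syntax. The precision is
stated on the business data itself, and a syntax binding would simply have to
honor it.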


I would suggest the following approach to solving these difficulties:

(1) Have a set of Representation terms, which function as "semantic
primitives," as originally intended by the CC group in ebXML. (The list may
need to be altered slightly to include some missing types, but will not
undergo wholesale expansion). This indicates what the business purpose of
the data is, in an abstract sense, wholly separate from how it will be
represented when syntax bound.

(2) The list of CCTs should be expanded to reflect the actual needs of
expressing business data in a syntax, to cover those cases where the syntax
itself is not the determining factor, but rather the representation of the
business data (as is the case for numeric precision of prices). Alternately,
the CCTs could have properties added to them so that the range of possible
representation formats is one of the properties. This would work very neatly
for something like datetimes, for example, but might become very confusing
and clunky in referring to numeric formats.

In either case, the range of expressive possibilities for CCTs should be

(3) Each representation term could be combined with some specified subset of
the available CCTs, as determined by the requirements of reality. An
"Identifier" RT could be a CodeType, or it could be a TextType, but we
would no longer have ambiguous types like "IdentifierType". The CCT would
now define an exact set of properties that would describe the actual
representation of the data, rather than describing its semantic or business
function. (This is essentially abstracting the idea of simpleTypes in XSD up
a level, but still referring to the actual physical representation of the
data, rather than its semantic.)  This would result in a significant
increase in the number of CCTs (or their expressive range) and would result
in the removal of some of the existing ones. The current list assumed a
many-to-one relationship between RTs and CCTs that is not usable in
reality. We need to re-work the list to reflect this finding.
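
The combinatorial relationship in point (3) could be sketched in Python as a
table of permitted RT/CCT pairings. This is only an illustration under stated
assumptions: the Identifier pairings follow the example above, while
`PatternType`, the Code pairings, and the `is_allowed` helper are
hypothetical names invented here.

```python
from enum import Enum


class RT(Enum):
    """Representation terms: the semantic primitives."""
    IDENTIFIER = "Identifier"
    CODE = "Code"
    AMOUNT = "Amount"


class CCT(Enum):
    """Core component types: the property sets used for representation."""
    CODE_TYPE = "CodeType"
    TEXT_TYPE = "TextType"
    PATTERN_TYPE = "PatternType"  # hypothetical, for pattern-based codes
    AMOUNT_TYPE = "AmountType"


# Each RT may be combined with a specified subset of the CCTs.
ALLOWED = {
    RT.IDENTIFIER: {CCT.CODE_TYPE, CCT.TEXT_TYPE},
    RT.CODE: {CCT.CODE_TYPE, CCT.PATTERN_TYPE},  # assumed pairing
    RT.AMOUNT: {CCT.AMOUNT_TYPE},
}


def is_allowed(rt: RT, cct: CCT) -> bool:
    """Return True if this RT may be represented with this CCT."""
    return cct in ALLOWED.get(rt, set())
```

Note that CodeType appears under both Identifier and Code, and Identifier
maps to more than one CCT, so the relationship is many-to-many rather than
many-to-one.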


The names "Representation Term" and "CCT" are not particularly good, so if
you want to reverse them, you could. I wanted to avoid confusion within CC,
however, by not redefining them as their exact opposites. Traditionally, the
RT was the semantic primitive, and the CCT was a construct that encapsulated
the property set for representing that primitive.
