ubl-lcsc message

Subject: Re: [ubl-lcsc] Document instance UUID

From: Chin Chee-Kai <cheekai@softml.net>
To: Chiusano Joseph <chiusano_joseph@bah.com>
Date: Fri, 07 Mar 2003 11:59:39 +0800 (SGT)

On Thu, 6 Mar 2003, Chiusano Joseph wrote:

>>On the other hand, if the documents were to be stored in a registry
>>(such as an ebXML Registry), they would automatically be assigned a UUID
>>by virtue of the fact that they are a RegistryObject.

Thanks for the comments.  Fully agree that real-life human
references to documents require the more familiar short-form
serial numbers.  As a side note, one of our company's trial 
proposals to refer to people by UUID in a project was met
with nothing better than frowning faces from the end-users. 
The idea here is not to have the UUID replace any of the IDs, 
but to add it as extra meta-data to facilitate UBL processors.

It's fine if the documents are assigned further UUIDs or 
various other tracking numbers within other systems (ebXML 
registry or otherwise) that are meaningful to those systems,
but my concern is that, within the scope of UBL, there appears 
to be no defined means of saying one document is "equal" to 
another.  To do that, we first need a consistent way to 
address/identify each document instance within the UBL specs.  

We carried out an XML Industrial Project (XIP) earlier last year
in Singapore under the auspices of the IT Standards Committee
(http://www.itsc.org.sg/downloads/xip/xip_index.html), and
performed some trial exchanges of document instances under 
that project.  One lesson learned from that small-scale trial 
was that when we generated and sent out document instances, 
particularly in close repetition, the receiving processor had 
to decide whether they were duplicates at the transport level 
or genuinely separate instances.  

Should the processor look at all pairs of document instances 
to identify duplicates?  How many parameters must the 
processor examine before it can conclude that two document 
instances are the same (i.e., is it just looking at sender, 
recipient and time, or also the items, quantities, unit price, 
etc., or the entire document, which becomes very processing 
intensive)?
If timing were used to distinguish documents (the rest of
the document being the same), how far apart in time must
the document instances be in order to be considered 
"separate instances"?

If the document serial ID (e.g. purchase order ID, invoice 
number, etc.) was manually keyed in (as we simulated some 
local practice for small companies), the chance of ID 
re-use (conscious or accidental) was not insignificant.
And we thought it easier to mandate that XIP software 
generate unique IDs (using UUIDs) than to mandate that 
end-users check that IDs are unique (which defeats the 
purpose of automation).  It's also a form of "self-defence" 
mechanism, so that document instances circulating within XIP 
were "clean" (unique).
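
To make the idea concrete, here is a minimal sketch in Python 
of how a receiving processor could tell a transport-level 
duplicate from a genuinely separate instance using a 
per-instance UUID.  It is only an illustration, not the XIP 
implementation: the dictionary layout, the field names 
("uuid", "id", "payload") and the names new_instance and 
Receiver are all assumptions made up for this example.

```python
import uuid


def new_instance(serial_id, payload):
    """Sender side (hypothetical helper): stamp every newly created
    document instance with a freshly generated random UUID."""
    return {"uuid": str(uuid.uuid4()), "id": serial_id, "payload": payload}


class Receiver:
    """Receiving processor (hypothetical): remembers the UUIDs it has seen."""

    def __init__(self):
        self.seen = set()

    def accept(self, doc):
        """Return True for a new instance, False for a resend of one
        that has already been received."""
        if doc["uuid"] in self.seen:
            return False
        self.seen.add(doc["uuid"])
        return True


receiver = Receiver()

# Two separate orders that happen to carry the same manually keyed-in
# serial ID and identical content.
po_a = new_instance("PO-001", {"item": "widget", "qty": 10})
po_b = new_instance("PO-001", {"item": "widget", "qty": 10})

# po_a arrives twice (transport-level resend), po_b arrives once.
results = [receiver.accept(po_a), receiver.accept(po_a), receiver.accept(po_b)]
print(results)  # [True, False, True]
```

The check is a single set lookup per document, so the 
processor does not need to compare pairs of instances or 
decide how many fields or how much time difference is enough.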

I thought sharing this experience here would be useful
for a conscious choice between incorporating or excluding
UUIDs within UBL document instances.

Best Regards,
Chin Chee-Kai
SoftML
