ubl-lcsc message

Subject: Preliminary Notes on Code List Schema Implementation
From: Chin Chee-Kai <cheekai@softml.net>
To: UBL LCSC <ubl-lcsc@lists.oasis-open.org>
Date: Wed, 17 Sep 2003 13:21:47 +0800 (SGT)
These are quick preliminary notes I gathered through some initial
implementation of code list schemas and linking them into
the Reusable and Model documents.  These are notes more to
myself explaining some micro-decisions made up on the fly
just to get things done.  But I think it'd be useful to share
with the list just to check on those areas.




Code List Catalogue:
====================
The filename will be labeled as (for 1.0-alpha release):

	UBL-CodeListCatalogue-1.0-alpha-draft-1.xls

and stored in the same "sss" directory as other model spreadsheets.



Inside the file:
================
1. The Code List Catalogue sheet will be named as
"CodeListCatalogue" instead of current "Sheet1".

2. The column header descriptions will be placed in Row 1,
in line with how the other model spreadsheets are laid out.
The extra blank rows currently sitting above the column headers
should be removed.


3. Each column header description should be a single line
containing no newlines.  For instance, the
"Documentary
Namespace Prefix" header, which currently has a newline after
"Documentary", should be written as just "Documentary Namespace Prefix",
relying on the cell formatting to break the text into
2 or more lines as necessary.


4. The first 3 column header descriptions will be:

"Code List Namespace", "Documentary Namespace Prefix"
and "Code List Default".


5. Data rows (non-column-header) that have an empty cell in
ANY one or more of the 3 columns indicated in (4) above will
be ignored.   With this mechanism, it is alright to leave
descriptive text etc. in the sheet as it is now.
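
To make rule 5 concrete, here is a minimal sketch of the row filter as
I understand it.  It assumes the sheet has already been read into a
list of rows (lists of cell values) with the headers in the first row;
the names usable_rows and is_blank are made up for illustration and are
not taken from the actual tool.

```python
# Hypothetical sketch: rows come from the "CodeListCatalogue" sheet,
# with row 0 holding the column headers (rule 2).
REQUIRED_COLUMNS = 3  # namespace, documentary prefix, code list default


def is_blank(cell):
    """Treat None and whitespace-only cells as empty."""
    return cell is None or str(cell).strip() == ""


def usable_rows(rows):
    """Return the data rows whose first three cells are all non-empty."""
    kept = []
    for row in rows[1:]:  # skip the header row
        # Pad short rows so missing cells count as empty.
        padded = list(row) + [None] * REQUIRED_COLUMNS
        if not any(is_blank(cell) for cell in padded[:REQUIRED_COLUMNS]):
            kept.append(row)
    return kept
```

Rows holding only descriptive text fall out naturally, since at least
one of their first three cells is empty.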




Namespace Construction:
=======================
A. A given code list, whether internal or external, will be
uniquely and normatively identified via an assigned namespace
value that is stored under the "Code List Namespace" column.

B. The namespace value *may* be a function of the code list name,
version, agency, plus other anti-collision variables that keep
two lists uniquely identified.  However, whether two given
code lists are "alike" or not is a UBL-human decision.
Complications (non-technical) could arise from minor version updates,
or from two exactly identical enumerations provided by two different
agencies.  The decision will be implemented by assigning the same
namespace value to code lists deemed to be the same, and different
namespace values to code lists deemed to be different.

C. Namespace values as stored in "Code List Namespace" now are
derived from *some* kind of dictionary entry name.  Those values,
when they do not contain any colon ":" character, will be 
internally imported into UBLish memory through another
derivation mechanism that makes them UBL-ish.  For example,
the "seen" namespace value for

	"DespatchAdvice. Type.. Code"

will be

"urn:oasis:names:tc:ubl:codelist:DespatchAdviceTypeCode:1.0:1.0-alpha"

However, if the value of "Code List Namespace" contains at
least a colon ":" character, then that value will be imported
into the schema generation tool as-is without further derivation.


D. In the next version of Code List Catalogue, the "Code List Namespace"
will be a normative value assigned by UBL to refer to any specific
code list whether or not it is internally or externally managed.

E. The "Documentary Namespace Prefix" will still be non-normative.
However, prefixes are required to be unique across all rows in the
Code List Catalogue, and the namespace value each prefix stands for
must be the same throughout all UBL usages.
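
As a rough illustration of (C) and (E), the sketch below derives the
"seen" namespace and checks prefix uniqueness.  It is a guess at the
mechanism, not the tool's code: I am assuming the derivation simply
strips everything except letters and digits from the dictionary entry
name, and that the two trailing fields are a code list version and a
UBL version.  The function names and the defaults are hypothetical.

```python
import re

URN_BASE = "urn:oasis:names:tc:ubl:codelist"


def derive_namespace(value, list_version="1.0", ubl_version="1.0-alpha"):
    """Return the namespace as the schema generation tool would see it."""
    value = value.strip()
    if ":" in value:
        # Rule C: a value containing a colon is imported as-is.
        return value
    # Assumption: drop spaces, dots and other punctuation from the
    # dictionary entry name to form the code list name.
    name = re.sub(r"[^A-Za-z0-9]", "", value)
    return f"{URN_BASE}:{name}:{list_version}:{ubl_version}"


def duplicate_prefixes(prefixes):
    """Rule E: return the sorted prefixes that occur more than once."""
    seen, duplicates = set(), set()
    for prefix in prefixes:
        if prefix in seen:
            duplicates.add(prefix)
        seen.add(prefix)
    return sorted(duplicates)
```

With these assumptions, "DespatchAdvice. Type.. Code" comes out as the
URN shown in (C), and any value already containing ":" passes through
untouched.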






XSD Imports Within Reusable & Document Schemas:
===============================================
(I). Because the *final* namespace value is a long form that would
be very cumbersome to use as a filename, the schema location
filename used for local storage will be derived systematically
with the proposed formula:

"UBL-" + prefix + "-" + UBLVersion [+ "-draft-" + draftNumber] + ".xsd"

The square brackets denote work-in-progress draft versions that
would be absent in final release.

So for instance, the  Shipment Priority Level Code with a prefix
defined as "spl" will have a mechanically derived filename of 

	UBL-spl-1.0-alpha-draft-1.xsd

and its <xsd:import> will look like the following:

  <xsd:import
    namespace="urn:oasis:names:tc:ubl:codelist:ShipmentPriorityLevelCode:1.0:1.0-alpha"
    schemaLocation="UBL-spl-1.0-alpha-draft-1.xsd" />
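
The formula in (I) is mechanical enough to write down directly.  This
is only a sketch; the function name schema_filename and its parameter
names are mine, not the tool's.

```python
def schema_filename(prefix, ubl_version, draft_number=None):
    """Build "UBL-" + prefix + "-" + UBLVersion [+ "-draft-" + n] + ".xsd"."""
    name = f"UBL-{prefix}-{ubl_version}"
    if draft_number is not None:
        # The draft part is present only for work-in-progress versions.
        name += f"-draft-{draft_number}"
    return name + ".xsd"
```

Calling schema_filename("spl", "1.0-alpha", 1) gives
UBL-spl-1.0-alpha-draft-1.xsd, and leaving out the draft number gives
the final-release form.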



(II).  Any prefixes referenced within Reusable or Document 
spreadsheets that are undefined or whose corresponding namespace
value cannot be found or is empty in the Code List Catalogue
will be flagged at the <xsd:import> block.  One example is:

<!--  ERROR: Empty or undefined namespace value for prefix reference "eph".
  -->
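
For (II), a sketch of how the import block might be produced per
prefix.  It assumes the catalogue has been reduced to a plain dict from
prefix to namespace value; import_block and its arguments are
hypothetical names, and the real tool's comment layout may differ
slightly from the single-line form used here.

```python
from xml.sax.saxutils import quoteattr


def import_block(prefix, catalogue, schema_location):
    """Return an <xsd:import> for the prefix, or an ERROR comment."""
    namespace = (catalogue.get(prefix) or "").strip()
    if not namespace:
        # Undefined prefix, or a prefix whose namespace cell is empty.
        return ("<!--  ERROR: Empty or undefined namespace value "
                f'for prefix reference "{prefix}". -->')
    # quoteattr escapes the value and wraps it in quotes.
    return (f"<xsd:import namespace={quoteattr(namespace)} "
            f"schemaLocation={quoteattr(schema_location)} />")
```

A prefix such as "eph" that is missing from the dict yields the ERROR
comment instead of an import.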



(III).  For Document spreadsheets, the ABIEs are tagged with
prefixes that are intended for instance-processing and have no
relationship with code lists.  However, the code list catalogue
mechanism is being "borrowed" to ferry the data into the schema.

It is true that the tool *can* tell or guess correctly that
"da", "in" etc should reference the namespace value of the
document itself (the targetNamespace).  However, this would
make the document ABIEs a special type of ABIE: Reusable ABIEs
go through the same mechanism, yet the two would be treated
differently.

Is this better/worse?  

As it is now, the code list generation logic is giving errors 
like:

<!--  ERROR: Empty or undefined namespace value for prefix reference "in".
  -->

As such, and also as a consistency checklist, I'd like to suggest
that the Code List Catalogue have entries for the prefixes associated
with the 8 documents, such as "da" (Despatch Advice, but what's
the namespace value?), "in" (Invoice, but what's the namespace
value?), etc, plus one for CCT & core component parameters just
to be complete.    If this is less preferable, we might need to
consider adding (yet another!) column in the Reusable and Model
spreadsheets with the heading "Documentary Instance Prefix",
and distinguish its use from that of "Documentary Namespace Prefix",
whose purpose would then be reserved solely for code list
prefixes.

Any comments?  Please.





Type Reference Errors:
======================

(i) For general type references that UBLish cannot find a
declaration for, an error comment block will be inserted into
the bottom chunk of global element definitions to indicate
the error.  The corresponding <element> that references an
unknown type will have its "type" attribute set to 
"unknown:[TypeRefName]Type"  where [TypeRefName] is replaced
with the local name of the originally intended type recorded
in the spreadsheet.   Thus, two examples found are:

<!--  ERROR: Unknown type reference "ReferenceType" required by element "CatalogueReference".
  -->
<!--  ERROR: Unknown type reference "GuidType" required by element "GloballyUniqueIdentifierGuid".
  -->
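
A sketch of the fallback in (i), under one reading of the rule: I am
assuming the name reported in the comment (e.g. "ReferenceType") is
already the full "[TypeRefName]Type" form, so the attribute becomes
"unknown:ReferenceType".  The function resolve_type and the set of
declared types are hypothetical.

```python
def resolve_type(element_name, type_name, declared_types):
    """Return (type attribute value, error comment or None)."""
    if type_name in declared_types:
        return type_name, None
    # No declaration found: mark the reference and report it.
    comment = (f'<!--  ERROR: Unknown type reference "{type_name}" '
               f'required by element "{element_name}". -->')
    return f"unknown:{type_name}", comment
```

The comment would then be appended to the bottom chunk of global
element definitions, while the <element> keeps the "unknown:" type.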



-----------------------------------
In general, the generated schema must contain no
<!-- ERROR: ... -->  or <!-- WARNING: ... --> blocks
in order for it to work properly.
-----------------------------------
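
One simple way to check this after generation is to scan the schema
text for such comment blocks.  A minimal sketch, assuming the schema is
available as a string and the comments follow the layout shown above;
find_flags is a made-up name.

```python
import re

# Matches <!-- ERROR: ... --> and <!-- WARNING: ... -->, including
# comments that span more than one line.
FLAG_PATTERN = re.compile(r"<!--\s*(ERROR|WARNING):\s*(.*?)\s*-->", re.DOTALL)


def find_flags(schema_text):
    """Return (kind, message) pairs for every flagged comment block."""
    return [(kind, " ".join(message.split()))
            for kind, message in FLAG_PATTERN.findall(schema_text)]
```

An empty result means the schema carries no flagged blocks.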


I'll send some interim results out soon.




Best Regards,
Chin Chee-Kai
SoftML
Tel: +65-6820-2979
Fax: +65-6743-7875
Email: cheekai@SoftML.Net
http://SoftML.Net/