ubl-clsc message

Subject: Re: [ubl-clsc] Re: [ubl-lcsc] Draft of code list paper, revised after face to face
From: "Anthony B. Coates" <abcoates@londonmarketsystems.com>
To: ubl-clsc@lists.oasis-open.org
Date: Sun, 07 Mar 2004 17:47:36 -0000
Better late than never, I've been reading the 20040303a version of the 
codelist document, and I thought I would post some comments.  I'm not 
going to go into as much detail as I might usually, though, as I have 
heard that at the last LCSC meeting, a lot of decisions about code lists 
were made, and I'm still waiting to see the minutes of that meeting.  I've 
made these notes in document order, not in order of importance.

0.: It would be easier to read the requirements if the 2nd level headings 
stood out from the 3rd level headings.

2.: The UBL code lists can't follow the classic "design from requirements" 
approach, since (a) they must follow the CCTS design from ebXML and (b) 
they must follow the NDR rules which are not based on requirements.  In a 
future version of this document (probably after UBL 1.0) it would be good 
to try and capture more explicitly any requirements that are inherited from 
CCTS or NDR, to include technical requirements to the effect that the design 
must be CCTS/NDR compliant, and then to list the requirements that are purely 
for code lists.  The current grouping of requirements is a good start, but 
I don't think it quite makes the contextual applicability of the 
requirements clear enough, particularly for someone who wants to 
understand how the design fulfils (or derives from) the requirements.

2.1: I think "cognizant" should be "cognisant".

2.1: "However, a single code list may not be required to meet all 
requirements simultaneously" - this suggests to me that there is a mixture 
of requirements here.  Some requirements relate to the generic code list 
data model, some are CCTS/UBL specific (is that the right split?).  If a 
code list can be allowed to break certain requirements, it needs to be 
clear why.  Presumably, it is because in some contexts the requirement 
does not apply.  I know there is an intention to provide something in 
future on this point in a separate section (#5), it's just that I think it 
would be better to capture the contextual grouping of requirements up 
front in section #2.

2.2.1: "As first-order business information entities (BIEs)." - is this 
supposed to be a phrase or a sentence?  It doesn't read clearly to me as it 
stands.

2.2.2: "As second-order information that qualifies some other BIE" - 
again, I would find a sentence easier to read.

2.2.2: "<Currency code=”EUR”>2456,000</Currency>" - what is this example 
supposed to be?  Is "2456,000" a code, or an amount (hopefully not an 
amount, since XML Schemas don't support commas in amounts)?  It isn't at 
all clear to me, so perhaps we need to either choose a different example, 
or describe this one with just enough extra information to set the context.
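
For context on the comma: the lexical form of xs:decimal allows only an 
optional sign, ASCII digits and a single period, so "2456,000" would not 
validate as an amount.  A minimal Python sketch of that check follows.  The 
function name is_xsd_decimal is made up for illustration, and the sketch 
ignores the whitespace collapsing that a real Schema processor performs.

```python
import re

# Lexical form of xs:decimal: an optional sign, then either digits with an
# optional fractional part, or a leading period followed by digits.
# Commas are not allowed anywhere.
_XSD_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_xsd_decimal(text):
    """Return True if text matches the xs:decimal lexical form (illustrative)."""
    return _XSD_DECIMAL.fullmatch(text) is not None


print(is_xsd_decimal("2456,000"))  # False
print(is_xsd_decimal("2456.000"))  # True
```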

2.2.3: Although I like the text here (OK, I was responsible for much of 
it), for me it goes beyond being a requirement.  Much of the text is 
aspirational.  Perhaps at some stage we could remove the text which is not 
absolutely necessary for the requirement, and see if we can put it 
elsewhere.

2.2.4: What I don't like about this requirement is that it makes a 
requirement to support forms, and then also makes a technical decision 
that this requires XML Schema support.  I don't believe that requirements 
should impose solutions, not unless the imposition of a particular 
solution is itself the requirement (as the requirements to support CCTS & 
NDR will be).

2.2.6: I think this requirement is a mixture of two things.  There need to 
be conformance tests, but that applies whether or not there is a data 
model.  A data model can be used as the input to the authors of the 
conformance tests, which is a good idea, but I also think there needs to 
be bi-directional traceability from requirements <--> data model <--> 
physical implementations (e.g. Schemas).  We don't yet have a mechanism to 
give us that traceability, I think.

2.2.7: This is a good case of a contextual requirement.  It is not at all 
a requirement on the code list data model, nor on the code list Schemas.  
It is a requirement on UBL document Schemas, that they provide for the 
inclusion of such information.  As such, I would say this is an NDR 
requirement, albeit one that will be fulfilled by a design guideline 
from CLSC.

2.3.3: There is no way we could stop the construction of private code 
lists (although we could make it easier or harder).  What I guess this 
requirement should say is that there is a requirement for private code 
lists to be supported by UBL.  There may also be a requirement for code 
lists to be sufficiently straightforward that 3rd parties can reasonably 
create their own private code lists without paying for consulting time 
from a CLSC member.

2.4: Do we really use the weighted points system any more?  I didn't think 
so.

2.4: The text in this section reads more like a glossary than like a set of 
requirements.  I found that a bit confusing when reading it.

2.4.1: I'm afraid the text here doesn't help me to understand the 
requirement.

2.4.9: This doesn't seem like a sensible example, not without some 
contextualisation (i.e. "the traditional English list of colours in a 
rainbow").

2.5.1: I thought we agreed that we could have code lists with zero or 1 
entries, rather than a minimum of 2.  It is poor design to use a constant 
to represent a list of length 1, if that list could grow in future.  When 
the list grows from 1 to 2 members, it then forces structural changes 
(from constant to list) in every application that uses it, and that can 
be expensive.
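
As a small illustration of the point about lists of length 1, here is a 
Python sketch.  The code list name and its values are invented; the only 
point is that code which treats a one-entry list as a list needs no 
structural change when a second entry appears.

```python
# Hypothetical code list that currently has exactly one entry.
TRANSPORT_MODES_V1 = frozenset({"ROAD"})

# A later revision adds a second entry; it is still the same kind of object.
TRANSPORT_MODES_V2 = TRANSPORT_MODES_V1 | {"RAIL"}


def is_valid_code(code, code_list):
    """Membership test that works for lists of any length, including 0 or 1."""
    return code in code_list


print(is_valid_code("ROAD", TRANSPORT_MODES_V1))  # True
print(is_valid_code("RAIL", TRANSPORT_MODES_V1))  # False
print(is_valid_code("RAIL", TRANSPORT_MODES_V2))  # True
```

Had the single value been hard-wired as a constant comparison, every caller 
would need rewriting at the point the second value arrived.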

3: Reading over this section again, I found it confusing, because what is 
presented isn't really a data model of a code list, because it doesn't 
include any mention of the codes.  I guess it is a model of information 
required to uniquely identify a code list, but that needs to be made 
clear, as it isn't something a reader would automatically expect.

4: Since we don't describe the full code list data model anywhere, it 
seems odd to say that this section describes the mapping to XML Schema.

Finally, in terms of how we move forward with actually modelling code 
lists (rather than just the identification data) after 1.0, the idea that 
I have been tossing around in my head of late is that we should define 
code lists using a tabular model.  Each row corresponds to an individual 
enumerable value in the code list.  Each column corresponds to a piece of 
data about a value.  Although the data in columns might often be simple 
data (e.g. integers, strings), there is no reason to exclude complex data 
(e.g. XML fragments).

In this model, a "key" would be any set of 1 or more columns which 
together uniquely identify all members of the code list.  So a code list 
might have multiple keys, and some of those keys might be compound keys 
involving more than one column.  This approach differs from what people 
have considered so far, where some columns are "content columns", and some 
are "index columns".  In practice, one person's index is another person's 
content, and it doesn't help to try and make such a distinction.  All that 
really matters is that some columns (or sets of columns) can be used to 
uniquely identify an enumerable "value" (row) from the code list.
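
To make the idea of keys concrete, here is a minimal Python sketch.  
Everything in it is an assumption made for illustration: the sample rows, 
the column names, and the helper functions is_key and candidate_keys are 
not part of any UBL design.  It treats a code list as a list of rows and 
reports which columns, or sets of columns, identify every row uniquely.

```python
from itertools import combinations

# Hypothetical code list: each row is one enumerable value,
# each column is a piece of data about that value.
ROWS = [
    {"country": "US", "subdivision": "CA", "name": "California"},
    {"country": "US", "subdivision": "WA", "name": "Washington"},
    {"country": "AU", "subdivision": "WA", "name": "Western Australia"},
]


def is_key(rows, columns):
    """True if the given columns together identify every row uniquely."""
    seen = set()
    for row in rows:
        value = tuple(row[c] for c in columns)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, max_size=2):
    """List minimal column sets (up to max_size columns) that act as keys."""
    columns = sorted(rows[0])
    keys = []
    for size in range(1, max_size + 1):
        for combo in combinations(columns, size):
            # Skip supersets of a key already found, so only minimal keys remain.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if is_key(rows, combo):
                keys.append(combo)
    return keys


print(candidate_keys(ROWS))
# [('name',), ('country', 'subdivision')]
```

Here "name" is a key on its own, while "country" and "subdivision" only 
form a key together.  Neither is marked as an index column or a content 
column; being a key is just a property of the data.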

Does that make sense?  I haven't quite got the language right, but I hope 
I got the idea across.  The underlying problem is very similar to the 
problems people solve with relational databases, so I think a similar 
model would lead to very workable solutions.
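
The relational analogy can be sketched with Python's built-in sqlite3 
module.  The table name, column names and sample rows below are invented 
for illustration, and this is only one possible way to express the idea: 
each key of the code list becomes a UNIQUE constraint, one of them compound.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE code_list (
        country     TEXT NOT NULL,
        subdivision TEXT NOT NULL,
        name        TEXT NOT NULL,
        UNIQUE (country, subdivision),  -- compound key
        UNIQUE (name)                   -- single-column key
    )
    """
)
conn.executemany(
    "INSERT INTO code_list VALUES (?, ?, ?)",
    [
        ("US", "CA", "California"),
        ("US", "WA", "Washington"),
        ("AU", "WA", "Western Australia"),
    ],
)


def try_insert(row):
    """Insert a row; return False if it would violate one of the keys."""
    try:
        conn.execute("INSERT INTO code_list VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Clashes with the existing (country, subdivision) pair ("US", "WA").
print(try_insert(("US", "WA", "Washington State")))  # False
# Unique on both keys, so it is accepted.
print(try_insert(("US", "OR", "Oregon")))  # True
```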

	Cheers,
		Tony.

--
Anthony B. Coates
London Market Systems Limited
33 Throgmorton Street, London, EC2N 2BR
http://www.londonmarketsystems.com/
mailto:abcoates@londonmarketsystems.com
Mobile/Cell: +44 (79) 0543 9026
[MDDL Editor (Market Data Definition Language), http://www.mddl.org/]
[FpML Arch WG Member (Financial Products Markup Language), 
http://www.fpml.org/]
References:
- Re: [ubl-lcsc] Draft of code list paper, revised after face to face
  - From: Burnsmarty@aol.com