OASIS Mailing List Archives

 



clr-dev message



Subject: Re: [clr-dev] Philosophy behind the CVA contracts


At 2011-02-08 15:17 -0600, ericdes wrote:
>Hi Ken,
>
>I understand that ISO added a numeric value in the country list, 
>probably because it's a best practice to have a primary key that 
>doesn't carry a meaning in the relational database world.
>
>But if these keys exist, then programs should use them. The problem 
>is we all know there are reasons why people won't use them IN 
>GENERAL. And I believe the CVA contracts are a much better way of 
>representing the world as it is, i.e. denormalized.
>
>I dreamed about doing things differently. I had this UBL sample in mind:
>
><PostalAddress>
><Country>
><IdentificationCode>US</IdentificationCode>
></Country>
></PostalAddress>
>
> From this sample, the XML record tells us that we're looking at an 
> address label, and that we're going to derive the country name from 
> its identification code. Country becomes CountryName in my mind 
> because of the context set by PostalAddress. I'm then looking up 
> the code US in the ISO list and find 'UNITED STATES'. And following 
> my reasoning my label should say 'U.S.A.' because it's how I want 
> to write the country name of 'UNITED STATES' on a mailing label.
>
>If a CVA partner wants to use the code 'USA' and our 
>agreed-upon code list says:
>
>       <Row>
>          <Value ColumnRef="code">
>             <SimpleValue>USA</SimpleValue>
>          </Value>
>          <Value ColumnRef="name">
>             <SimpleValue>UNITED STATES</SimpleValue>
>          </Value>
>       </Row>
>
>Then a program could easily find that it relates to the same country 
>because the value 'UNITED STATES' is identical in both lists. But 
>it's implying that 'UNITED STATES' becomes kind of a universal key 
>and it's a really bad practice to use strings for keys.
>
>So let's take the case where our agreed-upon code list says:
>
>       <Row>
>          <Value ColumnRef="code">
>             <SimpleValue>USA</SimpleValue>
>          </Value>
>          <Value ColumnRef="name">
>             <SimpleValue>United States of America</SimpleValue>
>          </Value>
>       </Row>
>
>It's obvious that human minds will make the link that both codes 
>relate to the same country, but computers will get confused. Also, 
>it's strange to notice that the ISO writes 'UNITED STATES' when the 
>unambiguous name is 'United States of America'. These kinds of 
>approximations will probably occur in customized agreed-upon code 
>lists and don't play well with computerization.
>
>I think that code lists should link the codes to unambiguous 
>definitions. A bit like this:
>
>       <Row>
>          <Value ColumnRef="code">
>             <SimpleValue>USA</SimpleValue>
>          </Value>
>          <Value ColumnRef="name">
>             <SimpleValue>UNITED STATES</SimpleValue>
>          </Value>
>          <Definitions>
>             <Definition>http://en.wikipedia.org/wiki/United_States</Definition>
>             <Definition>http://legal-dictionary.thefreedictionary.com/United+states+of+america</Definition>
>          </Definitions>
>
>       </Row>
>
>That seems better than publishing a universal primary key, which 
>might work for countries but won't work for many other entities. We 
>could imagine a program knowing some of these definitions and being 
>able to understand external data through whatever it holds in memory.
>
>Eric.

Using URIs is a very common way of identifying things ... in Topic 
Maps there is the concept of the Published Subject Identifier (PSI), 
which is a URI.  If you created a PSI for your concept (it could be 
the Wikipedia URI) and put that in one of your columns it would look like:

    <Row>
       <Value ColumnRef="code">
          <SimpleValue>USA</SimpleValue>
       </Value>
       <Value ColumnRef="name">
          <SimpleValue>UNITED STATES</SimpleValue>
       </Value>
       <Value ColumnRef="PSI">
          <SimpleValue>http://en.wikipedia.org/wiki/United_States</SimpleValue>
       </Value>
     </Row>

But how is that PSI really different from the ISO number for a 
country?  Both are just linguistically neutral values for the 
concept.  And all programs would need to know the PSI just as they 
would need to know the number.

Aren't your definitions in fact primary keys, just URIs instead of numbers?
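
To make the point concrete, here is a minimal Python sketch, assuming 
two partner lists reduced to bare rows (no namespaces, no column set, 
so not valid genericode as written).  The second list and the helper 
names rows, index_by and match_codes are hypothetical.  It suggests 
that the PSI column behaves exactly like a key: joining on "name" 
finds nothing when the spellings differ, while joining on "PSI" 
links US to USA.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified rows from two partners' code lists.
LIST_A = """
<SimpleCodeList>
  <Row>
    <Value ColumnRef="code"><SimpleValue>US</SimpleValue></Value>
    <Value ColumnRef="name"><SimpleValue>UNITED STATES</SimpleValue></Value>
    <Value ColumnRef="PSI"><SimpleValue>http://en.wikipedia.org/wiki/United_States</SimpleValue></Value>
  </Row>
</SimpleCodeList>
"""

LIST_B = """
<SimpleCodeList>
  <Row>
    <Value ColumnRef="code"><SimpleValue>USA</SimpleValue></Value>
    <Value ColumnRef="name"><SimpleValue>United States of America</SimpleValue></Value>
    <Value ColumnRef="PSI"><SimpleValue>http://en.wikipedia.org/wiki/United_States</SimpleValue></Value>
  </Row>
</SimpleCodeList>
"""


def rows(xml_text):
    """Return each Row as a dict mapping ColumnRef to its SimpleValue text."""
    root = ET.fromstring(xml_text.strip())
    return [
        {v.get("ColumnRef"): v.findtext("SimpleValue") for v in row.findall("Value")}
        for row in root.findall("Row")
    ]


def index_by(xml_text, column):
    """Index the rows of a list by the value found in the given column."""
    return {r[column]: r for r in rows(xml_text) if column in r}


def match_codes(list_a, list_b, key_column):
    """Pair up the codes of two lists whose rows share a key_column value."""
    a, b = index_by(list_a, key_column), index_by(list_b, key_column)
    return {a[k]["code"]: b[k]["code"] for k in a.keys() & b.keys()}


by_name = match_codes(LIST_A, LIST_B, "name")  # spellings differ, no match
by_psi = match_codes(LIST_A, LIST_B, "PSI")    # same URI, rows are linked
print(by_name, by_psi)
```

Swapping the URI for the ISO numeric value in that third column would 
give the same result, which is the sense in which the two are 
interchangeable as keys.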

Interestingly, in Topic Maps you can declare that two 
differently-expressed PSI strings are representative of the same 
semantic concept ... it is an aspect of merging topic maps ... it 
helps to infer new relationships between pieces of information not 
found in either of the individual Topic Maps.
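
As a loose illustration of that merging idea only (this is not the 
Topic Maps data model or its actual merging rules), the following 
Python sketch groups identifiers that have been declared equivalent 
and combines what is known about each.  The second identifier, the 
property names and the function merge_topics are all hypothetical.

```python
def merge_topics(topics, same_as):
    """Merge property dicts of identifiers declared to mean the same subject.

    topics:  dict mapping an identifier (e.g. a PSI URI) to a dict of properties
    same_as: iterable of (identifier, identifier) equivalence declarations
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Record each declared equivalence.
    for a, b in same_as:
        parent[find(a)] = find(b)

    # Combine the properties of every identifier in the same group.
    merged = {}
    for psi, props in topics.items():
        merged.setdefault(find(psi), {}).update(props)
    return list(merged.values())


WIKI = "http://en.wikipedia.org/wiki/United_States"
OTHER = "http://example.org/psi/country/usa"  # hypothetical second PSI

topics = {
    WIKI: {"iso_alpha2": "US"},
    OTHER: {"partner_code": "USA"},
}

merged = merge_topics(topics, [(WIKI, OTHER)])
print(merged)
```

Without the equivalence declaration the two entries stay separate; 
with it, a single merged entry relates US and USA even though neither 
source stated that relationship.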

But genericode expresses a simple taxonomy or ontology ... nothing as 
expressive as a Topic Map.

Do you think there is something missing from genericode/CVA that 
would help with your problem?  I'm not quite sure what the problem is 
that is being addressed in this thread.

I hope this helps.

. . . . . . . . . . . Ken

--
Contact us for world-wide XML consulting & instructor-led training
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/c/
G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal


