ubl-ndrsc message

Subject: Re: [ubl-lcsc] Re: [ubl-ndrsc] UBL: question on CCT language component

From: Tim McGrath <tmcgrath@portcomm.com.au>
To: Stephen Green <stephen_green@seventhproject.co.uk>
Date: Thu, 04 Mar 2004 11:59:15 +0800

I agree with Stephen on this.

In reviewing the CCT/SDT and UDT schemas we have been using up to draft 4 of 1.0, it appears we have dropped all use of normalizedString.
This is despite the resolution in San Francisco that token is too restrictive. From reading the XML Schema Primer (part two), I gather that:
string = a set of finite-length sequences of characters
normalizedString = strings that do not contain the carriage return (#xD), line feed (#xA) nor tab (#x9) characters
token = strings that do not contain the carriage return (#xD), line feed (#xA) nor tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces
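
To make the difference concrete, here is a minimal sketch of the three types side by side. The element names are hypothetical and are not taken from any UBL schema; the comments describe the whiteSpace facet that XML Schema fixes for each built-in type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative only: element names are hypothetical. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- whiteSpace = preserve: content is kept exactly as written -->
  <xsd:element name="AsString" type="xsd:string"/>

  <!-- whiteSpace = replace: each tab, line feed and carriage return
       becomes one space; runs of spaces and leading or trailing
       spaces are kept -->
  <xsd:element name="AsNormalizedString" type="xsd:normalizedString"/>

  <!-- whiteSpace = collapse: after the replace step, runs of spaces
       shrink to a single space and leading and trailing spaces are
       removed -->
  <xsd:element name="AsToken" type="xsd:token"/>

</xsd:schema>
```

For example, content consisting of two spaces, "AB", two spaces, "12" and one trailing space keeps every space as xsd:string or xsd:normalizedString, but becomes "AB 12" as xsd:token. That loss of internal double spaces is why token is too restrictive where spacing carries business meaning.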

What I understood we agreed to is that the XSD representation of the content of core component types should be:

Amount. Content = xsd:decimal
Binary Object. Content = xsd:base64Binary
Code. Content = xsd:normalizedString (currently this still says xsd:token)
Date Time. Content = xsd:dateTime
Identifier. Content = xsd:normalizedString (currently this still says xsd:token)
Indicator. Content = xsd:boolean
Measure. Content = xsd:decimal
Numeric. Content = xsd:decimal
Quantity. Content = xsd:decimal
Text. Content = xsd:string
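
As a rough sketch of what that mapping could look like in schema form, the fragment below declares three of the listed types. The target namespace is a placeholder, the type names are assumed for illustration, and the supplementary components (attributes) are left out on purpose so that only the content type is visible.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: placeholder namespace, assumed type names,
     supplementary components omitted. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="urn:example:cct-sketch"
            elementFormDefault="qualified">

  <!-- Code. Content = xsd:normalizedString (not xsd:token) -->
  <xsd:complexType name="CodeType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:normalizedString"/>
    </xsd:simpleContent>
  </xsd:complexType>

  <!-- Identifier. Content = xsd:normalizedString (not xsd:token) -->
  <xsd:complexType name="IdentifierType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:normalizedString"/>
    </xsd:simpleContent>
  </xsd:complexType>

  <!-- Text. Content = xsd:string -->
  <xsd:complexType name="TextType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:string"/>
    </xsd:simpleContent>
  </xsd:complexType>

</xsd:schema>
```

The remaining types in the list would follow the same pattern with xsd:decimal, xsd:base64Binary, xsd:dateTime and xsd:boolean as the extension base.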

I too am concerned that we seem to revert back every time. The solution is to drive these schemas from the Library rather than have them added on later.
I suggest we correct these for draft 6 and propose that we do this by making the mapping part of the CCT/SDT/UDT models.

As I mentioned in the discussion we had in Washington on the Friday morning, all supplementary components resolve to be a ". Content" for one of these core component types. So a supplementary component called "Numeric. Format. Text" is a "Text. Content" and a supplementary component called "Measure Unit. Code List Version. Identifier" is of "Identifier. Content", and so forth.
This means that theoretically we need only define datatypes as listed above and the supplementary components will take their xsd:datatypes from these.
However, it appears we have chosen in some cases to shortcut this principle by using inbuilt XSD datatypes. So we find that if the representation is "Identifier" and the property term is "Uniform. Resource" we use xsd:anyURI and if the property term is "Language" we use xsd:language.

The nett result of this will be that we do not have xsd:token anywhere - which is fine by me.
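
A sketch of how the Code type might then look, with no xsd:token left in it, is shown below. The attribute names come from the schema quoted further down in this thread. The namespace is a placeholder, and the assignment of each attribute to Identifier, Text or a built-in shortcut is inferred from the attribute names, so it should be read as an assumption rather than an agreed mapping.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: placeholder namespace; the attribute typing is an
     assumption inferred from the attribute names. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="urn:example:cct-sketch"
            elementFormDefault="qualified">

  <xsd:complexType name="CodeType">
    <xsd:simpleContent>
      <!-- Code. Content -->
      <xsd:extension base="xsd:normalizedString">
        <!-- Identifier supplementary components take Identifier. Content -->
        <xsd:attribute name="listID" type="xsd:normalizedString" use="optional"/>
        <xsd:attribute name="listAgencyID" type="xsd:normalizedString" use="optional"/>
        <xsd:attribute name="listVersionID" type="xsd:normalizedString" use="optional"/>
        <!-- Text supplementary components take Text. Content -->
        <xsd:attribute name="listAgencyName" type="xsd:string" use="optional"/>
        <xsd:attribute name="listName" type="xsd:string" use="optional"/>
        <xsd:attribute name="name" type="xsd:string" use="optional"/>
        <!-- Shortcuts to built-in XSD datatypes -->
        <xsd:attribute name="languageID" type="xsd:language" use="optional"/>
        <xsd:attribute name="listURI" type="xsd:anyURI" use="optional"/>
        <xsd:attribute name="listSchemeURI" type="xsd:anyURI" use="optional"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

</xsd:schema>
```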

Does this agree with your findings, Anne?

Stephen Green wrote:

Anne

I'm worried that in all this we might inadvertently drop the LC/NDR
resolution to use xsd:normalizedString wherever there is the possibility
that more than one space need be preserved (having business meaning).

I think it was decided for Identifier but there was less certainty about
using it for Code.

*I note that most Supp Comps are identifiers.*
Perhaps this got neglected as changes were made to the Schemas and
then the omissions reinforced as we discussed xsd:token versus xsd:string.

Steve

----- Original Message -----
From: "Anne Hendry" <anne.hendry@sun.com>
To: <ubl-lcsc@lists.oasis-open.org>; <ubl-ndrsc@lists.oasis-open.org>
Sent: Saturday, February 28, 2004 6:16 AM
Subject: [ubl-ndrsc] UBL: question on CCT language component

   Hi,

In matching up the UBL xsd datatype assignments to cct types, I could
use some clarification on the 'language*' components.  All other
components listed in table 8-2 of ccts 2.01 have a 'content' component
and then some supplementary components.  However, language* components
seem to be all supplementary, with no content component - they just
appear as 'Language.Identifier' and 'Language.Locale.Identifier'.  So,
then, going through the xsd representation of the Code type in table
8-2, for example, I have all the attributes (supplementary components)
of Code type accounted for, but there is one extra in the schema that is
not in the cct 8-2 table.  That is 'languageID'.  Here is the xsd for
the 'Code' element:

<xsd:simpleContent>
  <xsd:extension base="xsd:token">
    <xsd:attribute name="listID" type="xsd:token" use="optional" />
    <xsd:attribute name="listAgencyID" type="xsd:token" use="optional" />
    <xsd:attribute name="listAgencyName" type="xsd:token" use="optional" />
    <xsd:attribute name="listName" type="xsd:token" use="optional" />
    <xsd:attribute name="listVersionID" type="xsd:token" use="optional" />
    <xsd:attribute name="name" type="xsd:token" use="optional" />
    <xsd:attribute name="languageID" type="xsd:language" use="optional" />
    <xsd:attribute name="listURI" type="xsd:anyURI" use="optional" />
    <xsd:attribute name="listSchemeURI" type="xsd:anyURI" use="optional" />
  </xsd:extension>
</xsd:simpleContent>

The definition of the 'Language.Identifier' component in the ccts 8-2
table defines it as "The identifier of the language used in the
corresponding text string."  Well, ok, there is a 'text' string (name)
in the schema right above this that corresponds to the ccts entry in 8-2
'Code.Name.Text'.  However, there are other 'text' strings (eg. in
BinaryObject Name, etc) in 8-2 that don't seem to have a 'languageID'
attribute attached.

So my questions are:

a) Why does Language.Identifier not have a content component?  It seems
like somewhat of a free-floating supplementary component the way it is
now.

b) When/how do we choose to use the Language.Identifier?
c) Along with Language.Identifier in table 8-2 there is
'Language.Locale.Identifier', also a seemingly free-floating
supplementary component.  Is it expected that wherever there is a
Language.Identifier attribute needed/used there should also be a
Language.Locale.Identifier?

I am only looking at the cct schema. I'm assuming the rep terms schema
is going away.  Is this a correct assumption and I should only look at
cct?  What about the cc parameters schema - will that be auto-generated
(or will that one be removed too)??

Thanks,
Anne


To unsubscribe from this mailing list (and be removed from the roster of
the OASIS TC), go to
http://www.oasis-open.org/apps/org/workgroup/ubl-ndrsc/members/leave_workgroup.php.

    




-- 
regards
tim mcgrath
phone: +618 93352228  
postal: po box 1289   fremantle    western australia 6160