OASIS Mailing List Archives

 



ubl message



Subject: The current discussion about Specialised Data Types seems to focus on code lists issues.


The current discussion about Specialised Data Types seems to focus on code list issues. But Specialised Data Types are meant to be used for any other restriction of a Data Type (length and other facets) as well. This is a CCTS concept, and I hope it is what Tim's Specialised Data Types spreadsheet was made for.
Example: A data type carrying the official Customs definition and restrictions can be reused for different ABIEs.
If a code list can be used by two different Data Types AND we want to work redundancy-free, then code lists and Data Types should become separate objects.
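
A minimal sketch of such a non-code-list Specialised Data Type, restricted by facets only. The namespace, the type and element names, and the length limits are invented for illustration and are not an official Customs definition.

<xsd:schema targetNamespace="urn:example:sdt-sketch" xmlns="urn:example:sdt-sketch" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
    <!-- Hypothetical specialised data type: an identifier limited by length facets -->
    <xsd:simpleType name="CustomsReferenceIdentifierType">
        <xsd:restriction base="xsd:normalizedString">
            <xsd:minLength value="1"/>
            <xsd:maxLength value="35"/>
        </xsd:restriction>
    </xsd:simpleType>
    <!-- The same type reused by two elements, standing in for BBIEs of different ABIEs -->
    <xsd:element name="ConsignmentCustomsReference" type="CustomsReferenceIdentifierType"/>
    <xsd:element name="DeclarationCustomsReference" type="CustomsReferenceIdentifierType"/>
</xsd:schema>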
Michael
 
 -----Original Message-----
From: Tim McGrath [mailto:tmcgrath@portcomm.com.au]
Sent: Wednesday, 17 March 2004 11:09
To: Stephen Green
Cc: ubl@lists.oasis-open.org
Subject: Re: [ubl] Specialised DataTypes Schema Module

In response to this, I want to make the point that it has been agreed that we need Specialised Data Types for Code Lists. The question is whether we need to define them in two places - once in a combined schema and once in individual schemas for each code list.  As has been expressed on several occasions, we definitely want to have separate schemas for each code list. So, the real question is, do we need the combined SDT schema?

Secondly, this issue is not a matter of changing existing practice.  We have never used the SDT approach before, so whatever we do is new and therefore untested.  Your conclusion recommends that we *not* remove the SDT Schema - I would say you are actually putting the case to add it.  It is not there in 1.0-Beta (except as a placeholder that was not referenced) and it is not required by the code list representation mechanism (or the CCTS - to get that argument out of the way).  Given our commitment to have separate schemas for each code list module, this combined SDT schema is an additional and new module, not an existing artifact we wish to retain.

To support its value, I had hoped you would...
a. provide an example of what the combined SDT schema should contain to demonstrate its value - clearly the one we generate now is not suitable.
and/or
b. use a complete example - by not removing DerivedCodeType you make it harder to contrast, not easier.

However, I think that even so, you would agree there are no architectural reasons why we need both a combined SDT schema module and separate Schema Modules for each code list.

The entire argument for using a combined SDT schema seems to be its possible future value for customisation.  This is based on the idea that the combined SDT acts as an index to the code list schemas.  I think the argument goes: if we change a code list schema, we need only change the index to update the document schemas.  I am not sure you make the case that this is essential for introducing substitution groups.  In fact, the CLSC paper's examples are based on not having the combined SDT module - so clearly we can use the substitution group method of extension without it.

The real point should be whether having a combined SDT schema makes introducing different code lists easier.

Please note that we already have an index for mapping logical codes to code lists.  This is our specialised data type spreadsheet (and the corresponding EDIFIX model).

The way this works currently is...
a. We establish a UBL data type (a qualified CCTS data type) for the BBIE.  Using your example, we have Currency_ Code. Type for our Invoice. Transaction Currency. Code.
b. We also define a logical mapping from this to a specific, physical code list using our Specialised Data Type model.  So Currency_ Code. Type becomes bound to the ISO 4217 Alpha version 0.3 code list.  This binding is by code list, by namespace and by namespace prefix.  For example, the UBL 1.0 Currency_ Code. Type has a codelistID of "ISO 4217 Alpha", a namespace prefix of "cur:" and a CodeListSchemeURI of "urn:oasis:names:tc:ubl:codelist:CurrencyCode:1:0".
c. The schema generator then uses these to assemble the structures.
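
As a rough sketch of where steps a to c end up, a code list schema module bound to that namespace might look something like the following. The content type name and the three sample values are assumptions for illustration; this is not the generated UBL module.

<xsd:schema targetNamespace="urn:oasis:names:tc:ubl:codelist:CurrencyCode:1:0" xmlns="urn:oasis:names:tc:ubl:codelist:CurrencyCode:1:0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
    <!-- Hypothetical content type: a few ISO 4217 alpha codes, for illustration only -->
    <xsd:simpleType name="CurrencyCodeContentType">
        <xsd:annotation>
            <xsd:documentation>Code list: ISO 4217 Alpha</xsd:documentation>
        </xsd:annotation>
        <xsd:restriction base="xsd:token">
            <xsd:enumeration value="AUD"/>
            <xsd:enumeration value="EUR"/>
            <xsd:enumeration value="USD"/>
        </xsd:restriction>
    </xsd:simpleType>
    <!-- The type that other schemas refer to as cur:CurrencyCodeType -->
    <xsd:complexType name="CurrencyCodeType">
        <xsd:simpleContent>
            <xsd:extension base="CurrencyCodeContentType"/>
        </xsd:simpleContent>
    </xsd:complexType>
</xsd:schema>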

It is worth reminding ourselves that there is now, and will be forever more, only one UBL 1.0 specialised data type model and only one UBL 1.0 Currency_ Code. Type defining one set of UBL 1.0 Currency Code values (the ones stated in ISO 4217 Alpha 0.3). Any change or customisation means a different set of code values, which means a different code list and therefore a different namespace.  So, if someone wanted to use ISO 4217 Numeric Codes, they can use Currency_ Code. Type but they must have their own namespace for this and their own specialised data type model to map it.  Alternatively, they can define a Numeric Currency_ Code. Type and keep the two options logically and physically separate - which seems more sensible.  Either way, both methods must use their own namespaces.

And exactly the same options would apply to someone trying to use their own set of code values.

So what difference does introducing a combined SDT schema make?

Well, without a combined SDT schema, if someone wants to hand craft schemas for their own code lists, then they would have to change the document schemas (to replace the namespace - but not the prefix).  Some people may consider this a feature.  Using a set of code values other than the ones published in 1.0 makes this a non-compliant UBL 1.0 document schema.  There is no assurance of interoperability.  So having an edited/different document schema (and corresponding changes to its namespace) makes it clear that it is a customised implementation.
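
A sketch of what such a document schema fragment might look like without the combined SDT schema. The Invoice namespace, the schema location and the element name are assumptions for illustration only.

<xsd:schema targetNamespace="urn:example:invoice-sketch" xmlns="urn:example:invoice-sketch" xmlns:cur="urn:oasis:names:tc:ubl:codelist:CurrencyCode:1:0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
    <!-- Swapping in another code list means editing this import (namespace and location) -->
    <!-- and the xmlns:cur declaration above. The prefix "cur" itself can stay. -->
    <xsd:import namespace="urn:oasis:names:tc:ubl:codelist:CurrencyCode:1:0" schemaLocation="UBL-CodeList-CurrencyCode-1.0.xsd"/>
    <!-- The code element is typed directly by the code list schema -->
    <xsd:element name="TransactionCurrencyCode" type="cur:CurrencyCodeType"/>
</xsd:schema>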

However, if you want to hide this customisation and reduce the amount of editing required, then a combined SDT schema acting as an index between the logical names and physical namespaces would be the way to do it.  With this method, if we want to make Currency_ Code. Type the ISO 4217 Numeric version we can modify the combined SDT schema to indicate the new namespace/location.  The Invoice schema still thinks it is using CurrencyCodeType, but it picks up a different Code List schema.

To do this effectively, the combined SDT schema should only describe this logical to physical mapping and leave all other metadata to the code list schema itself.  Otherwise we will get them out of sync.  As with most indices, the entries should just be pointers and carry no supplementary information themselves.  So I would expect to see the combined SDT schema as something like....

<xsd:schema targetNamespace="urn:oasis:names:tc:ubl:SpecialisedDatatypes:1:0-draft-8.3" xmlns:cur="urn:oasis:names:tc:ubl:codelist:CurrencyCode:1:0-draft-8.3" xmlns:ccts="urn:oasis:names:tc:ubl:CoreComponentParameters:1:0-draft-8.3" xmlns="urn:oasis:names:tc:ubl:SpecialisedDatatypes:1:0-draft-8.3" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified" version="1:0-draft-8.3">
    <xsd:import namespace="urn:oasis:names:tc:ubl:CoreComponentParameters:1:0-draft-8.3" schemaLocation="UBL-CoreComponentParameters-1.0-draft-8.3.xsd"/>
    <xsd:import namespace="urn:oasis:names:tc:ubl:codelist:CurrencyCode:1:0-draft-8.3" schemaLocation="../codelist/use/UBL-CodeList-CurrencyCode-1.0-draft-8.3.xsd"/>
    <xsd:complexType name="CurrencyCodeType">
        <xsd:simpleContent>
            <xsd:extension base="cur:CurrencyCodeType"/>
        </xsd:simpleContent>
    </xsd:complexType>
</xsd:schema>
(obviously with entries for the other code lists as well.)

If I got this right, it should say "what the document schema calls sdt:CurrencyCodeType is actually cur:CurrencyCodeType".   Whilst this example seems trite, we could have different names for the mapping.  If someone wanted to adopt ISO 4217 Numeric codes we could change this to....

<xsd:schema targetNamespace="urn:oasis:names:tc:ubl:SpecialisedDatatypes:1:0-draft-8.3" xmlns:cur="urn:oasis:names:tc:myown:codelist:CurrencyCode:1:0" xmlns:ccts="urn:oasis:names:tc:ubl:CoreComponentParameters:1:0-draft-8.3" xmlns="urn:oasis:names:tc:ubl:SpecialisedDatatypes:1:0-draft-8.3" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified" version="1:0-draft-8.3">
    <xsd:import namespace="urn:oasis:names:tc:ubl:CoreComponentParameters:1:0-draft-8.3" schemaLocation="UBL-CoreComponentParameters-1.0-draft-8.3.xsd"/>
    <xsd:import namespace="urn:oasis:names:tc:myown:codelist:CurrencyCode:1:0" schemaLocation="../codelist/use/Customised-NumericCodeList-CurrencyCode-1.0.xsd"/>
    <xsd:complexType name="CurrencyCodeType">
        <xsd:simpleContent>
            <xsd:extension base="cur:PossiblyDifferentNameForCurrencyCodeType"/>
        </xsd:simpleContent>
    </xsd:complexType>
</xsd:schema>
(where we change the namespace/location and the name of the base type).

This has some obvious architectural elegance, but is it actually solving a real problem?

Firstly, because we have changed the combined SDT schema definitions, it now needs a new version.  This means changes in each affected document schema.  Our Invoice schema now has to import a different combined SDT schema.

So what have we gained?  Instead of changing the namespace for the Code List schema we change the namespace for the combined SDT schema (as well as changing the combined SDT schema itself).

It appears that, with or without the combined SDT schema, we end up changing the Invoice document schema whenever we change the code lists applying to any of its codes.
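
To make the comparison concrete, here is a sketch of the Invoice side when a customised combined SDT schema is used. The Invoice namespace, the customised SDT namespace and file name, and the element name are assumptions for illustration only.

<xsd:schema targetNamespace="urn:example:invoice-sketch" xmlns="urn:example:invoice-sketch" xmlns:sdt="urn:example:myown:SpecialisedDatatypes:1:0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
    <!-- A customised combined SDT schema has its own namespace, so this import -->
    <!-- and the xmlns:sdt declaration above both have to be edited. -->
    <xsd:import namespace="urn:example:myown:SpecialisedDatatypes:1:0" schemaLocation="Customised-SpecialisedDatatypes-1.0.xsd"/>
    <!-- The type reference itself is unchanged -->
    <xsd:element name="TransactionCurrencyCode" type="sdt:CurrencyCodeType"/>
</xsd:schema>

The element declaration is untouched, but the schema as a whole is still a different file from the published one.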

But, as I noted above, this is a good thing.  Because after these changes, it is no longer the UBL 1.0 Invoice schema.  Any instances will need to use different values to be validated.

Another side issue with this idea of using a combined SDT schema is what to do about implementations that want to use their own code lists (the "placebo" ones).  I presume we would not want them to add to the UBL combined SDT schema.  So do they create their own combined SDT schema?  Then we get sets of these and so on, and so on...

I keep coming back to the idea of making this simple - a code has a qualified code data type that maps onto a schema that looks like the CLSC schema (less the substitution group - for now).  We collectively refer to all these code list schemas as 'specialised data types' and everyone is happy :-)

This discussion reminds me of my grandfather leaving his wood offcuts - just in case he might need them later.  My shed is full of old bits of wood - maybe I can give some to Stephen.

Stephen Green wrote:
Specialised DataTypes Schema Module
 
The next Co-ordination meeting will be preceded by a meeting to discuss
the content of the Specialised DataTypes Schema Module. In particular
Tim has suggested that, since it does not seem to contain anything not
found already in other Schema modules, it may be that we can do without it.
 
In preparation for this discussion I have built a set of Schemas, as we have
in draft 8.3 but without the SDT Schema. The only document schema included
in this is the invoice schema. An invoice instance was produced too.
 
The changes necessary were as follows:
 
1. The namespaces for the codelist schema modules had to be added to both
the document schema modules (just the invoice in this example) and to the
Common Aggregate Components Schema Module, along with the schema locations.
 
2.  References to Codes in these, where the code has a codelist Schema Module in UBL,
(but, importantly, *not* where it doesn't) have to be changed from
 
                       'type="sdt:CurrencyCodeType"'
to, say,            'type="cur:DerivedCodeType"'.
 
(I did not attempt to amend the use of the name 'DerivedCodeType' since I wished to
compare the results as closely as possible with draft-8.3.)
 
The sample invoice (a sample with maximal element and attribute content, generated with
XML Spy) was valid both against the original schemas and against the new ones. Ideally
the namespaces should change (sdt removed and cur added), but the invoice is in fact valid
(using XML Spy - what about the XSD spec and other parsers?) without the namespace change,
since the namespaces of the codes' types are effectively hidden in the instances.
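
As a sketch of why the type's namespace stays hidden, consider an instance fragment like the following. The namespace URI and the element names are assumptions for illustration, not the draft-8.3 ones.

<Invoice xmlns="urn:example:invoice-sketch">
    <!-- The element sits in the namespace of the schema that declares it. -->
    <!-- Whether its type is sdt:CurrencyCodeType or cur:DerivedCodeType -->
    <!-- is recorded only in the schema, not here. -->
    <TransactionCurrencyCode>USD</TransactionCurrencyCode>
</Invoice>

Neither the sdt nor the cur namespace is mentioned, unless an instance names a type explicitly with xsi:type.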
 
 
This then seems to support the case for successful removal of the SDT Schema module.
 
However, a major concern would be:
 
1. What happens if UBL or other groups wish to add new codelist schema modules
where, at present, either UDT is used for the code's type or the code is new to UBL
altogether? Such a change would appear not to break backwards compatibility with
the SDT Schema Module in place, as at present (or with the substitutionGroup design),
but would this still be so with the SDT removed?
 
Such a change would be encouraged if substitutionGroups were introduced for 1.1 say.
Would this removal of the SDT prevent the later use of substitutionGroups in terms
of the need to preserve backwards compatibility?
 
2.  Does backwards compatibility only apply to instances? Does it not in some ways
apply to schemas even in cases where instances can be unaffected? Is the removal
of the SDT Schema Module going to adversely affect backwards compatibility when
a new codelist needs to be added or one which was based on UDT is changed to having
a new Codelist Schema Module as the base of its type? After all, to implement the
facilities offered by substitutionGroup / abstract element Schema architecture one might
have to create a codelist Schema module where previously there was only the UDT.
 
In answer:
 
Adding a codelist schema module that didn't exist before, or requiring that a new
namespace be introduced to the Document Schema Modules and the Common Aggregate
Schema Modules does not necessarily mean that these namespaces have to be changed
in the instances. Though one might wish that it did, it might have negative ramifications
on the backwards compatibility.
 
Adding a codelist means adding a new namespace and a new prefix to the SDT at present
but not necessarily elsewhere.
 
Without the SDT, the namespace prefix has to be added to the type on which a
Code element is based. So the namespace and prefix have to be added to the CAC and,
where appropriate, the Document Schema Modules.
 
They do not have to be added to the instance (to my knowledge), but they could be.
I do not think that adding them would necessarily cause instance problems, though I
wouldn't be very surprised if it did in some situations such as with XPaths and XSLT
Stylesheets as well as some applications. I'd really want to check it with the experts
- and do we have time to do so?
 
Even if there were no instance problems that we could foresee, there is still the need
to go updating namespaces and prefixes in Schema Modules which appeared to be immune
before when adding or, in some cases, changing Codelist Modules.
 
Conclusions
 
I would prefer, in the light of Jon's recent statement "...taking care to construct 1.0 in a
way that will allow the adoption of substitution groups in 1.1 without breaking 1.0 instances",
that we *not* remove the SDT Schema Module at this stage without further expert assurance
that it will not cause foreseeable problems with 1.1
 
I think it may be worth getting extra advice regarding the effect of changing a codelist schema
namespace on an invoice with regards to backwards compatibility too. There is no adverse effect in
XML Spy and StylusStudio but how about other parsers and XSLT stylesheets? Have we any comeback
about this from LMI or Ken? The question is - changing a namespace in a Schema which is not directly
referenced by an instance - does it ever cause problems for such instances in a way that some
would view as meaning that such changes break backwards compatibility?
 
One way round this, if the SDT were removed (and it might not hurt even if it weren't), might be
to create schema modules of all our codelists so that we don't get problems adding these later.
This doesn't seem ideal though (we did it for beta but it meant a large set of schemas and
greater complexity and maintenance). I know I sought to assure that this wouldn't be necessary
when considering adding substitutionGroups, etc for 1.1 but without the SDT I wouldn't be so sure.
 
 
 
Stephen Green
 
 
 
 
 
 
 


-- 
regards
tim mcgrath
phone: +618 93352228  
postal: po box 1289   fremantle    western australia 6160


