wd-cmsc-cmguidelines-v0.4-04
Matthew Gertner <matthew@acepoint.cz>
Eduardo Gutentag, Sun Microsystems, Inc. <eduardo.gutentag@sun.com>
This is a draft document and is likely to change on a weekly basis.
If you are on the <{xxx}@lists.oasis-open.org> list for committee members, send comments there. If you are not on that list, subscribe to the <{xxx}-comment@lists.oasis-open.org> list and send comments there. To subscribe, send an email message to <{xxx}-comment-request@lists.oasis-open.org> with the word "subscribe" as the body of the message.
For information on whether any patents have been disclosed that may be essential to implementing this specification, and any offers of patent licensing terms, please refer to the Intellectual Property Rights section of the Security Services TC web page (http://www.oasis-open.org/committees/security/).
Copyright © 2003 OASIS Open, Inc. All Rights Reserved.
With the first public release of UBL, version 0p70 [Reference], users can begin to gain experience using the library in their applications for the interchange of business data among trading partners. Although the library will be subject to change and extension as it approaches the final version, it already contains important document types informed by the broad experience of members of the UBL Technical Committee [Reference], including EDI and XML experts.
One of the most important lessons learned from previous standards is that no business library is sufficient for all purposes. Requirements differ significantly among companies and industries, and a customization mechanism is therefore needed in many cases before the document types can be used in real-world applications. A primary motivation for moving from the relatively inflexible EDI formats to a more robust XML approach is the existence of formal mechanisms for performing this customization while retaining maximum interoperability and validation.
As a result, it is a UBL expectation that:
Customization will indeed happen,
It will be done by national and industry groups and smaller user communities,
These changes will be driven by real world needs, and
These needs will be expressed as context drivers.
EDI dealt with this issue through a subsetting mechanism: the UN/EDIFACT standard [Reference???] was subsetted through industry Implementation Guides (IGs), which were then subsetted into trading partners' IGs, which were in turn subsetted into departmental IGs. UBL proposes dealing with it through schema derivation.
Thus UBL starts as generic as possible, with a set of schemas that supply all that's likely to be needed in the 80/20 or core case, which is UBL's target. Then it allows both subsetting and extension according to the needs of the user communities and according to what is permitted in the derivation mechanism it has chosen, namely W3C XML Schema [Reference].
These customizations are based on the eight context drivers identified by ebXML (see below and [Reference to context drivers TC output]). Any given schema component always occupies a location in this eight-space, even if not a single one has been identified (that is, if a given context driver has not been narrowed, it means that it is true for all its possible contextual values). For instance, UBL has an Address type that may have to be modified if the Geopolitical region in which it will be used is Thailand. But as long as this narrowing down of the Geopolitical context has not been done, the Address type applies to all possible values of it, thus occupying the "any" position in this particular axis of the eight-space.
In order for the interoperability and validation mentioned at the beginning of this section to be achieved, care must be taken to adhere to strict guidelines when customizing UBL schemas. Although the UBL TC intends to produce a customization mechanism that can be applied as an automatic process in the future, this phase (known as Phase II, and predicted in [Reference to ebXML Context methodology or UBL itself]) has not been reached. Instead, Phase I, the current phase, offers the guidelines included in this document.
This document aims to describe the procedure for customizing UBL, with three distinct goals.
The first goal is to ensure that UBL users can extend UBL schemas in a manner that:
allows for their particular needs,
can be exchanged with trading partners whose requirements for data content are different but related, and
is UBL compatible.
The second goal is to provide a couple of canonical escape mechanisms for those whose needs extend beyond what the compatibility guidelines can offer. Although the product of these escape mechanisms cannot claim UBL compatibility, it can at least offer a clear description of its relationship to UBL, a claim that cannot be made by other ad hoc methods.
The third goal is to gather use case data for the future UBL context extension methodology, the automatic mechanism for creating customized UBL schemas that we have referred to as Phase II.
The major output of the UBL TC is encapsulated in a series of UBL Schemas [Reference]. It is assumed that in many cases users will need to customize these schemas for their own use. In accordance with ebXML [Reference] the UBL TC expects this customization to be carried out only in response to contextual needs (see [xxx]) and by the application of any one of the eight identified context drivers and their possible values.
It must be noted that the UBL schemas themselves are the result of a theoretical customization:
Behind every UBL Schema, an Ur-schema exists in which all elements are optional and all types are abstract. As mandated in the XSD specification, abstract types cannot be used as written; they can only be used as a starting point for deriving new, concrete types. Ur-types are modelled as abstract types since they are designed for derivation. Whether the UBL TC actually produces and publishes a copy of these Ur-schemas is irrelevant, since it is possible for anyone to derive an Ur-schema deterministically from any of the schemas produced by the UBL TC.
The first set of derivations from the abstract Ur-types is the UBL Schema Library itself, which is assumed to be usable in 80% of business cases. These derivations contain additional restrictions to reduce ambiguity and provide a minimum set of requirements to enable interoperable trading of data by the application of one context, Business Process. The UBL schema may then be used by specific industry organizations to create their own customized schemas. When the Schema is used, conformance with UBL may be claimed. When a Schema that has been customized through the UBL-sanctioned derivation process is used, conformance with UBL may also be claimed.
It is assumed that in many cases specific businesses will use customized UBL schemas. These customized schemas contain derivations of the UBL types, created through additional restrictions and/or extensions to fit more precisely the requirements of a given class of UBL users. The customized UBL Schemas may then be used by specific organizations within an industry to create their own customized schemas.
Due to the extensibility of W3C Schema, this process can be applied over and over to refine a set of schemas more and more precisely, depending on the needs of specific data flows.
In other words, there is no theoretical limit to how many times a Schema can be derived, leading to the possible equivalent of infinite recursion. In order to avoid this, the Rule of Once-per-Context has been developed, as presented later in "Context Chains".
Central to the customization approach used by UBL is the notion of schema derivation. This is based on object-oriented principles, the most important of which are inheritance and polymorphism. The importance of the latter can be gleaned from its linguistic origin: poly, meaning "many", and morph, meaning "shape". By adhering to these principles, document formats with different "shapes" can be used interchangeably.
The UBL Naming and Design Rules Subcommittee (NDRSC [Reference]) has decided to use XSD, the standard XML schema language produced by the World Wide Web Consortium (W3C [Reference]), to model document formats. One of the most significant advances of XSD over previous XML document description languages like DTDs is that it has built-in mechanisms for handling inheritance and polymorphism, which we will refer to as "XSD derivation". It therefore fits well with the real-world requirements for business data interchange and our goal of interoperability and validation.
There are two important types of modification that XSD derivation does not allow. The first can be summarized as the deletion of required components (that is, the reduction of a component's cardinality from x..y, where x is at least 1, to 0..y). The second is the placement of an addition at an arbitrary location in the content model through extension. There may be some cases where the user needs a different location for the addition, but XSD extension only allows addition at the end of a sequence.
Thus, there are three different scenarios covering the derivation of new types from existing ones:
Compatible UBL Customization
An existing UBL type can be modified to fit the requirements of the customization through XSD derivation. These modifications can include extension (adding new information to an existing type), and/or refinement (restricting the set of information allowed to a subset of that permitted by the existing type).
Non-compatible UBL Customization
An existing UBL type could be modified to fit the requirements of the customization, but the changes needed go beyond those allowed by XSD derivation.
No existing UBL type is found that can be used as the basis for the new type. Nevertheless, the base library of core components that underlies UBL can be used to build up the new type so as to ensure that interoperability is at least possible at the core component level.
These Guidelines will deal with each of the above scenarios, but we will first and foremost concentrate on the first, as it is the only one that can produce UBL-compatible schemas.
XSD derivation allows for type extension and restriction. These are the only means by which one can customize UBL schemas and claim UBL compatibility. Any other possible means, even if allowed by XSD itself, is not allowed by UBL.
[EG: For the purposes of the examples in this section, a copy of BuyerPartyType should be presented here]
XSD extension is used when additional information must be added to an existing UBL type. For example, a company might use a special identification code in relation to trading partners. This code should be included in addition to the standard information used in a BuyerParty description (AccountCode, PartyName, Address, etc.) when purchasing goods. This can be achieved by creating a new type that references the existing type and adds the new information:
<xsd:complexType name="MyBuyerPartyType">
  <xsd:complexContent>
    <xsd:extension base="cat:BuyerPartyType">
      <xsd:sequence>
        <xsd:element name="InternalSupplierCode" type="xsd:string"/>
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
Some observations:
Notice that derivation can be applied only to types and not to elements that use those types. This is not a problem; UBL uses explicit type definitions for all elements, in fact disallowing XSD use of anonymous types that define a content model directly inside an element declaration. [See note from Dan regarding anonymous types, mail from 04/09/03:10:20AM]
This derived type MyBuyerPartyType can be used anywhere the original BuyerPartyType is allowed. The instance document should use the xsi:type attribute to indicate that a derived type is being used [EG: This sentence may have to change if we adopt the use of adapter schemas]. This does not enforce the use of the new type inside a given element, however, so an Order schema could still be created using the standard UBL BuyerParty type. If the user wishes to require the use of the derived type, a new derived type must be created from the Order type using refinement and specifying that the MyBuyerPartyType is used. [EG: Dan says this is confusing; I'm not sure how to fix it]
UBL defines global elements for all types, and these elements, rather than the types themselves, are used in aggregate element declarations. The same procedure should be used for derived types, so a global MyBuyerParty element should be created based on the MyBuyerPartyType.
All derived types should be created in a separate namespace (which might be tied to the user organization) and reference the UBL namespaces as appropriate. [Appropriate reference to UBL's namespace usage, and perhaps reference to a section in this document]
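The observations above can be sketched as a small customization schema. This is only an illustration under stated assumptions: the namespace URIs (urn:example:...), the schemaLocation file name and the elementFormDefault setting are placeholders chosen for the example, not values defined by UBL.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical customization schema. Both namespace URIs and the
     schemaLocation file name are placeholders, not real UBL values. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:cat="urn:example:ubl:cat"
            xmlns:my="urn:example:acme:ubl-custom"
            targetNamespace="urn:example:acme:ubl-custom"
            elementFormDefault="qualified">

  <!-- Reference the UBL namespace that defines BuyerPartyType. -->
  <xsd:import namespace="urn:example:ubl:cat"
              schemaLocation="ubl-cat.xsd"/>

  <!-- Derived type, defined in the organization's own namespace. -->
  <xsd:complexType name="MyBuyerPartyType">
    <xsd:complexContent>
      <xsd:extension base="cat:BuyerPartyType">
        <xsd:sequence>
          <xsd:element name="InternalSupplierCode" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Global element for the derived type, mirroring UBL practice. -->
  <xsd:element name="MyBuyerParty" type="my:MyBuyerPartyType"/>
</xsd:schema>
```

An instance document could then signal the derived type with xsi:type where a standard BuyerParty element is expected. The fragment assumes that the global BuyerParty and ID elements live in the namespace bound to the cat prefix, and the data values are invented:

```xml
<!-- Hypothetical instance fragment; optional children are omitted. -->
<cat:BuyerParty xmlns:cat="urn:example:ubl:cat"
                xmlns:my="urn:example:acme:ubl-custom"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:type="my:MyBuyerPartyType">
  <cat:ID>B-1001</cat:ID>
  <!-- Added by the extension, so it follows the inherited content. -->
  <my:InternalSupplierCode>SUP-0042</my:InternalSupplierCode>
</cat:BuyerParty>
```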
XSD restriction is used when information in an existing UBL type must be constrained or taken away. For instance, the UBL BuyerPartyType permits the inclusion of any number of addresses or none. If a specific organization wishes to allow exactly one address, this is achieved as follows (note that the annotation fields are removed from the type definition to make the example more readable):
<xsd:complexType name="MyBuyerPartyType">
  <xsd:complexContent>
    <xsd:restriction base="cat:BuyerPartyType">
      <xsd:sequence>
        <xsd:element ref="ID" id="UBL000090"/>
        <xsd:element ref="AccountCode" id="UBL000091" minOccurs="0"/>
        <xsd:element ref="PartyName" id="UBL000092" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="Address" id="UBL000093" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="PartyTaxScheme" id="UBL000094" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="BuyerContact" id="UBL000095" minOccurs="0"/>
      </xsd:sequence>
    </xsd:restriction>
  </xsd:complexContent>
</xsd:complexType>
Note that the entire content model of the base type, with the appropriate changes, must be repeated when performing restriction.
A very important characteristic of XSD restriction is that the resulting type must still be valid in terms of the original type. That is, it must be a true subset of the original, such that any document that validates against the restricted type also validates against the original one. Thus:
you can reduce the number of repetitions of an element (that is, change its cardinality from 1..100 to 1..50, for instance)
you can eliminate an optional element (that is, change its cardinality from 0..3 to 0..0)
you cannot eliminate a required element or make it optional (that is, change its cardinality from 1..3 to 0..3)
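The three rules can be illustrated with a small self-contained schema. The type and element names (BaseType, RestrictedType, Line, Note, Code) are hypothetical and were chosen only to mirror the cardinalities in the list above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative schema only; all names here are invented. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:complexType name="BaseType">
    <xsd:sequence>
      <xsd:element name="Line" type="xsd:string" minOccurs="1" maxOccurs="100"/>
      <xsd:element name="Note" type="xsd:string" minOccurs="0" maxOccurs="3"/>
      <xsd:element name="Code" type="xsd:string" minOccurs="1" maxOccurs="3"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="RestrictedType">
    <xsd:complexContent>
      <xsd:restriction base="BaseType">
        <xsd:sequence>
          <!-- Allowed: fewer repetitions (1..100 narrowed to 1..50). -->
          <xsd:element name="Line" type="xsd:string" minOccurs="1" maxOccurs="50"/>
          <!-- Allowed: an optional element eliminated (0..3 to 0..0). -->
          <xsd:element name="Note" type="xsd:string" minOccurs="0" maxOccurs="0"/>
          <!-- Required element kept at 1..3. Writing minOccurs="0" here
               would widen the range and the restriction would be rejected. -->
          <xsd:element name="Code" type="xsd:string" minOccurs="1" maxOccurs="3"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

Every instance that is valid against RestrictedType is also valid against BaseType, which is the subset property that restriction must preserve.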
Every time a derivation is performed on a UBL or UBL-derived Schema, the context driver and the driver value used must be documented. If this is not done, then by definition the derived Schema is not UBL-compliant.
Context is expressed using a set of name/value pairs (context driver, driver value), where the names are one of a limited set of context drivers established by the UBL TC:
Business process
Official constraints
Product classification
Business process role
Industry classification
Supporting role
Geopolitical
System capabilities
There is no pre-set list of values for each driver. See [Reference to the Context Driver SC or some ebXML document that talks about this.]
There is no predetermined order in which context drivers are applied.
More than one context driver might be applied to various types within the same set of schema extensions. Therefore, documentation at the root level is not enough. Context should be included as an element Context (in the UBL namespace) inside the documentation for each customized type, with the name of the context driver expressed as in the list above, but using capitalized camel case. The Context element has two attributes, driver and value. For example, if the type is to be used in the French automobile industry, the Context documentation would appear as follows:
<xsd:annotation>
  <xsd:documentation>
    <ubl:Context driver="IndustryClassification" value="Automotive"/>
    <ubl:Context driver="Geopolitical" value="France"/>
  </xsd:documentation>
</xsd:annotation>
If a customization is made that does not fit into any of the existing context drivers, it should be described in prose inside the Context element:
<xsd:annotation>
  <xsd:documentation>
    <ubl:Context>Used for jobs performed on weekends to specify additional data required by the trade union</ubl:Context>
  </xsd:documentation>
</xsd:annotation>
Any issues with the set of context drivers currently defined or the taxonomies to be used for specifying values should be communicated to the UBL Context Driver Subcommittee.
As mentioned in "Customization of Customization", there is a risk that derivations may form extremely long and unmanageable chains. In order to avoid this problem, the Rule of Once-per-Context was formulated: no context can be applied, at a given hierarchical level of that context, more than once in a chain of derivations. Thus, if the Geopolitical context driver with a value of "USA" has been applied to a type, it is possible to apply it again with a value that is a subset of the original value, or that occupies a hierarchically lower level, like California or New York. It cannot, however, be applied with a value at an equal or higher level in the hierarchy, like Japan. In order to use that latter value, one must go up the ladder of the customization chain and derive the type from the same location as that from which the original was derived.
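A derivation chain that respects the rule might look like the sketch below. It is illustrative only: the namespace URIs, the schemaLocation, the cat:AddressType base and the three derived type names are assumptions made for the example, and the type-specific additions are left as comments.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative only: all namespace URIs, the schemaLocation and the
     derived type names are placeholders invented for this sketch. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ubl="urn:example:ubl"
            xmlns:cat="urn:example:ubl:cat"
            xmlns:my="urn:example:acme:ubl-custom"
            targetNamespace="urn:example:acme:ubl-custom"
            elementFormDefault="qualified">

  <xsd:import namespace="urn:example:ubl:cat" schemaLocation="ubl-cat.xsd"/>

  <!-- Step 1: Geopolitical applied with the value USA. -->
  <xsd:complexType name="UsAddressType">
    <xsd:annotation>
      <xsd:documentation>
        <ubl:Context driver="Geopolitical" value="USA"/>
      </xsd:documentation>
    </xsd:annotation>
    <xsd:complexContent>
      <xsd:extension base="cat:AddressType">
        <!-- USA-specific additions would go here. -->
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Step 2 (allowed): same driver, hierarchically lower value,
       derived from the USA type. -->
  <xsd:complexType name="CaliforniaAddressType">
    <xsd:annotation>
      <xsd:documentation>
        <ubl:Context driver="Geopolitical" value="California"/>
      </xsd:documentation>
    </xsd:annotation>
    <xsd:complexContent>
      <xsd:extension base="my:UsAddressType">
        <!-- California-specific additions would go here. -->
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Japan sits at the same level as USA, so it must not extend
       UsAddressType. It goes back up the chain and derives from the
       type that UsAddressType itself was derived from. -->
  <xsd:complexType name="JapanAddressType">
    <xsd:annotation>
      <xsd:documentation>
        <ubl:Context driver="Geopolitical" value="Japan"/>
      </xsd:documentation>
    </xsd:annotation>
    <xsd:complexContent>
      <xsd:extension base="cat:AddressType">
        <!-- Japan-specific additions would go here. -->
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```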
There are two important types of customization that XSD derivation does not allow. The first can be summarized as the deletion of required components (that is, the reduction of a component's cardinality from x..y, where x is at least 1, to 0..y). The second is the placement of an addition at an arbitrary location in the content model through extension. There may be some cases where the user needs a different location for the addition, but XSD extension only allows addition at the end of a sequence.
Because XSD derivation does not allow these types of customization, any attempts at enabling them must by necessity produce results that are not UBL compatible. However, in order to allow users to customize their schemas in a UBL-friendly manner, the notion of an Ur-schema was invented: for each UBL Schema, an Ur-schema exists in which all elements are optional and all types are abstract. The use of abstract types is necessary because Ur-types can never be used as is; a derived type must be created, as per the definition of abstract types in the XSD specification.
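As a rough sketch of what such an Ur-type might look like, the fragment below applies the two rules (every element optional, type abstract) to the BuyerPartyType content model used in the examples of this document. Since the UBL TC may never publish the Ur-schemas, this is an assumption derived from those rules rather than a normative definition. The enclosing schema element and the global element declarations it refers to are omitted.

```xml
<!-- Hypothetical Ur-type for BuyerPartyType, assumed to live in the
     namespace bound to the "ur" prefix in the derivation examples. -->
<xsd:complexType name="BuyerPartyType" abstract="true">
  <xsd:sequence>
    <!-- ID is required in the UBL type but optional in the Ur-type. -->
    <xsd:element ref="ID" minOccurs="0"/>
    <xsd:element ref="AccountCode" minOccurs="0"/>
    <xsd:element ref="PartyName" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element ref="Address" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element ref="PartyTaxScheme" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element ref="BuyerContact" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
```

Because the type is abstract, an instance can never be typed directly as the Ur-type. Both the standard UBL type and any custom type are restrictions of it.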
XSD derivation is sufficient for most cases, but in some instances it might be necessary to perform changes to the UBL types that are not handled by standard mechanisms. In this case, the UBL Ur-types should be used. Remember, an Ur-type exists for each UBL standard type and differs only in that all elements in the content model are optional, including elements that are required in the standard type. By using the Ur-type, the user can therefore make modifications, such as eliminating a required field, that would not be possible using XSD derivation on the standard type.
For instance, suppose an organization would like to use the UBL BuyerPartyType, but does not want to use the required ID element. In this case, normal XSD refinement is used, but on the Ur-type rather than the standard type:
<xsd:complexType name="MyBuyerPartyType">
  <xsd:complexContent>
    <xsd:restriction base="ur:BuyerPartyType">
      <xsd:sequence>
        <xsd:element ref="ID" id="UBL000090" minOccurs="0" maxOccurs="0"/>
        <xsd:element ref="AccountCode" id="UBL000091" minOccurs="0"/>
        <xsd:element ref="PartyName" id="UBL000092" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="Address" id="UBL000093" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="PartyTaxScheme" id="UBL000094" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="BuyerContact" id="UBL000095" minOccurs="0"/>
      </xsd:sequence>
    </xsd:restriction>
  </xsd:complexContent>
</xsd:complexType>
The new type is no longer compatible with the UBL BuyerPartyType, so standard processing engines that know about XSD derivation will not recognize the type relationship. However, some level of interoperability is still preserved, since both UBL BuyerPartyType and MyBuyerPartyType are derived from the BuyerPartyType Ur-type. If this additional flexibility is required, a processor can be implemented to use the Ur-type rather than the UBL type. It will then be able to process both the UBL type and the custom type, since they have a common ancestor in the Ur-type.
Once again, changes to the Ur-type do not enforce changes in the enclosing type, so the UBL OrderType has to be changed as well if the user organization wants to ensure that only the new MyBuyerPartyType is used. In fact, the new OrderType will not be compatible with the UBL OrderType, since MyBuyerPartyType is no longer derived from UBL BuyerPartyType. However, the new OrderType can be derived from the OrderType Ur-type to achieve maximum interoperability.
Sometimes no type can be found in the UBL library or the Ur-type library that can be used as the basis of a new type. In this case, we should still strive for maximum interoperability by building up the new type using types from the core component library that underlies UBL.
For example, suppose a user organization needs to include a specialized product description inside business documents. This description includes a unique ID, a name and the storage capacity of the product expressed as an amount. The type definition should appear as follows:
<xsd:complexType name="ProductDescriptionType">
  <xsd:sequence>
    <xsd:element name="ID" type="cct:IdentifierType"/>
    <xsd:element name="Name" type="cct:TextType"/>
    <xsd:element name="Capacity" type="cct:AmountType"/>
  </xsd:sequence>
</xsd:complexType>
It goes without saying that all new names defined when creating custom types from scratch should also conform to the UBL Naming and Design Rules.
It is planned that a context extension methodology will be designed to enable automatic customization of UBL types for specific purposes. This methodology works by using a formal specification of the reasons for customizing the type, known as the context. By expressing the context formally and specifying rules for adapting types based on context, most of the changes that need to be made to UBL in order for it to fit in a given usage environment can be generated by the context engine rather than performed manually. In addition, significant new flexibility can be gained, since rules from two complementary contexts can be applied simultaneously, yielding types appropriate for, say, the automobile industry and the French geopolitical entity.
UBL has not yet progressed to this stage of development. For now, one of the main goals of the UBL Context Methodology Subcommittee is to gather as many use cases as possible to determine what types of customizations are performed in the real world, and on what basis. Another important goal is to ensure that types derived at this point from UBL's version 1 can still be used in the future, intermixed with types derived automatically.
The following individuals were members of the committee during the formulation of this document:
Copyright © The Organization for the Advancement of Structured Information Standards [OASIS] 2001, 2002. All Rights Reserved.
OASIS takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on OASIS's procedures with respect to rights in OASIS specifications can be found at the OASIS website. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification, can be obtained from the OASIS Executive Director.
OASIS invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to implement this specification. Please address the information to the OASIS Executive Director.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to OASIS, except as needed for the purpose of developing OASIS specifications, in which case the procedures for copyrights defined in the OASIS Intellectual Property Rights document must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
OASIS has been notified of intellectual property rights claimed in regard to some or all of the contents of this specification. For more information consult the online list of claimed rights.