ubl message

Subject: Re: SV: [ubl] Two essential items for Atlantic call
From: "G. Ken Holman" <gkholman@CraneSoftwrights.com>
To: <ubl@lists.oasis-open.org>
Date: Wed, 12 Jul 2006 09:11:49 -0400
At 2006-07-12 10:26 +0200, Bryan Rasmussen wrote:
>Hi,
>
> >>   UBLExtensionType:
>
> >I had originally thought it would look something like this:
> >
> >        <xsd:complexType name="UBLExtensionType">
> >            <xsd:sequence>
> >                <xsd:any namespace="##any" processContents="skip"
> >                         minOccurs="1" maxOccurs="unbounded"/>
> >            </xsd:sequence>
> >        </xsd:complexType>
>
>I think there was miscommunication between various parties. Some people
>wanted it with metadata as to the extension reason. I suggested using
>attributes on the element but it was pointed out we couldn't have new
>attributes.

I was unaware of the request for metadata.  I 
have no problems including metadata, but it has 
to be encapsulated to accommodate multiple 
extensions under the one extension point.  In 
fact I can see that the metadata suggested would 
be very helpful as it would allow an application 
to *examine* the extension with known constructs 
without having to know the extension namespace 
used for the private data:  this could be *very* useful.

>Then in the Brussels meeting Stephen and I tried to specify the metadata as
>to the reason of an extension as a separate element, a sibling of the
>extension. I suppose this was changed, giving what you show below:
>
> ><xsd:complexType name="UBLExtensionType">
> >    <xsd:sequence>
> >        <xsd:element ref="cbc:ID" minOccurs="0" maxOccurs="1"/>
> >        <xsd:element ref="cbc:Name" minOccurs="0" maxOccurs="1"/>
> >        <xsd:element ref="cbc:AgencyID" minOccurs="0" maxOccurs="1"/>
> >        <xsd:element ref="cbc:AgencyName" minOccurs="0" maxOccurs="1"/>
> >        <xsd:element ref="cbc:VersionID" minOccurs="0" maxOccurs="1"/>
> >        <xsd:element ref="cbc:AgencyURI" minOccurs="0" maxOccurs="1"/>
> >        <xsd:element ref="cbc:URI" minOccurs="0" maxOccurs="1"/>
> >        <xsd:element ref="cbc:ExtensionReasonCode" minOccurs="1" maxOccurs="1"/>
> >        <xsd:element ref="cbc:ExtensionReason" minOccurs="0" maxOccurs="1"/>
> >        <xsd:element ref="cbc:ExtensionContentAny" minOccurs="1" maxOccurs="1"/>
> >    </xsd:sequence>
> ></xsd:complexType>
> >    - One might argue that all of these elements (except for
> >'ExtensionContent') are properties of the 'ExtensionContent'
> >element, *not* the 'UBLExtension' element.
>
>Then you would have something like
>
><xsd:complexType name="ExtensionContentAnyType">
>     <xsd:sequence>
>         <xsd:element ref="cbc:ID" minOccurs="0" maxOccurs="1"/>
>         <xsd:element ref="cbc:Name" minOccurs="0" maxOccurs="1"/>
>         <xsd:element ref="cbc:AgencyID" minOccurs="0" maxOccurs="1"/>
>         <xsd:element ref="cbc:AgencyName" minOccurs="0" maxOccurs="1"/>
>         <xsd:element ref="cbc:VersionID" minOccurs="0" maxOccurs="1"/>
>         <xsd:element ref="cbc:AgencyURI" minOccurs="0" maxOccurs="1"/>
>         <xsd:element ref="cbc:URI" minOccurs="0" maxOccurs="1"/>
>         <xsd:element ref="cbc:ExtensionReasonCode" minOccurs="1" maxOccurs="1"/>
>         <xsd:element ref="cbc:ExtensionReason" minOccurs="0" maxOccurs="1"/>
>         <xsd:any namespace="##any" processContents="skip" minOccurs="1"
>                  maxOccurs="unbounded"/>
>     </xsd:sequence>
></xsd:complexType>
>
>Yes, but if they were inside of the ExtensionContentAny then that would be
>non-deterministic.

But wouldn't the non-determinism disappear with encapsulation?

>As an aside I still prefer not to have the content of ExtensionContentAny
>(or whatever name is chosen for actually holding the extending elements) be
>unbounded.

Yes, you mentioned this earlier.  It is 
unfortunate that you missed the discussion we had 
during the Atlantic call.  On my part it is 
unfortunate that I missed the discussion of extensions in Brussels.

On the call I cited what I think will be an expected 
use case:  a single instance carrying multiple extensions.

Consider that here at Crane I plan to create a 
Crane extension in which I will have line-item 
detail that supplements the base line item 
summary with information required for Crane's 
legacy invoice printing requirements.  No one 
else will have any interest in these legacy print 
requirements, so they can ignore the Crane part under the extension point.

But ... if I create an invoice for the NES, then 
I simultaneously need two extensions under the 
extension point:  one for my legacy format and 
one for the NES extensions.  My stylesheets 
ignore the NES extension and the NES applications ignore the Crane extensions.

>I prefer only one child of the extension because I think it offers
>the most flexible way of dealing with that content, as in: serialize the
>child of ExtensionContentAny as its own XML and track it in conjunction
>with the parent document. Doing that with multiple children can create
>performance problems for some technologies.

Can you elaborate on this?  Because we have 
positioned the extension element as the first 
child of the document element, a serial 
processing application can cache expected 
extension information and ignore unexpected 
extension information before beginning any 
processing of baseline UBL content.  A 
tree-processing application isn't affected since 
it can randomly access only what it is looking 
for and ignore what it doesn't expect.

These were arguments I brought up on the call.
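
A minimal sketch of the serial-processing argument, in Python with the
standard library's iterparse.  The namespace URIs, the element names under
the extensions and the sample instance are assumptions invented for the
example; they are not the real UBL namespaces.  Because the extension point
comes first, one pass can cache the expected extension and discard the
unexpected one before any baseline content is seen.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this sketch only.
CAC = "urn:example:ubl:cac"
CRANE = "urn:example:crane"

SAMPLE = b"""<inv:Invoice xmlns:inv="urn:example:ubl:invoice"
    xmlns:cac="urn:example:ubl:cac" xmlns:cbc="urn:example:ubl:cbc"
    xmlns:crane="urn:example:crane" xmlns:nes="urn:example:nes">
  <cac:UBLExtensions>
    <cac:UBLExtension>
      <crane:LegacyExtension><crane:Detail>A</crane:Detail></crane:LegacyExtension>
    </cac:UBLExtension>
    <cac:UBLExtension>
      <nes:InvoiceExtension/>
    </cac:UBLExtension>
  </cac:UBLExtensions>
  <cbc:ID>INV-1</cbc:ID>
  <cac:InvoiceLine/>
</inv:Invoice>"""


def process_serially(source, wanted_ns):
    """Single pass: cache expected extensions, then list baseline children."""
    cached, baseline = [], []
    depth = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            continue
        depth -= 1  # depth now equals the number of open ancestors
        if elem.tag == "{%s}UBLExtension" % CAC:
            for payload in elem:
                if payload.tag.startswith("{%s}" % wanted_ns):
                    cached.append(payload)  # expected: keep for later use
            elem.clear()  # unexpected payloads are simply dropped
        elif depth == 1 and elem.tag != "{%s}UBLExtensions" % CAC:
            baseline.append(elem.tag)  # a direct child of the document element
            elem.clear()
    return cached, baseline


cached, baseline = process_serially(io.BytesIO(SAMPLE), CRANE)
print([e.tag for e in cached])
print(baseline)
```

The hypothetical nes:InvoiceExtension is never retained, and the cached
Crane payload is already available by the time the first baseline element
is reached.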

>Not separating the extending content from the real content also
>forces implementers to worry about generic queries on the XML structure.

I disagree ... since XPath addressing is the 
basis of most querying languages, the number of 
extension children has no impact when 
directly addressing the main body.

>Thus
>one cannot use XPath or other technology to get all //com:InvoiceLine because
>that would also get the lines possibly put somewhere under the Extension.

But the use of "//" is recognized as bad practice 
in XPath and XSLT, except in circumstances where 
leaf-level constructs are being addressed.

Also, since extensions would be in alternate 
namespaces, if the "com:" namespace is a UBL 
namespace, then the "//com:InvoiceLine" would 
only address baseline constructs.  My legacy 
extensions would be in "crane:InvoiceLine" and 
would not be addressed by the lazy use of 
"//".  For the benefit of readers, "//" is 
considered bad practice because it doesn't "stop" 
when it hits elements it addresses:  it keeps 
going down the document tree to each and every 
leaf node of each and every sub-tree of the 
document tree looking for InvoiceLine elements, 
even below any InvoiceLine elements it finds.  It 
is the *biggest* hit on stylesheet performance of 
any XPath addressing construct.
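
A minimal sketch of the namespace point, in Python with the standard
library's ElementTree (which implements only a subset of XPath).  The
namespace URIs and the sample instance are assumptions invented for the
example, and "com" merely stands in for the UBL prefix used in the quoted
query.  It compares the descendant search with direct addressing of the
document element's children.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; "com" stands in for the UBL namespace prefix
# used in the quoted query.
NS = {
    "com": "urn:example:ubl:common",
    "crane": "urn:example:crane",
}

SAMPLE = """\
<com:Invoice xmlns:com="urn:example:ubl:common" xmlns:crane="urn:example:crane">
  <com:UBLExtensions>
    <com:UBLExtension>
      <crane:LegacyExtension>
        <crane:InvoiceLine>legacy print detail</crane:InvoiceLine>
      </crane:LegacyExtension>
    </com:UBLExtension>
  </com:UBLExtensions>
  <com:InvoiceLine>line 1</com:InvoiceLine>
  <com:InvoiceLine>line 2</com:InvoiceLine>
</com:Invoice>
"""

invoice = ET.fromstring(SAMPLE)

# Descendant search: walks the whole tree, but matches on the
# namespace-qualified name, so crane:InvoiceLine is not selected.
lazy = invoice.findall(".//com:InvoiceLine", NS)

# Direct addressing: only children of the document element are considered.
direct = invoice.findall("com:InvoiceLine", NS)

# The extension's own lines are reachable only through the extension namespace.
legacy = invoice.findall(".//crane:InvoiceLine", NS)

print([line.text for line in lazy])
print([line.text for line in direct])
print([line.text for line in legacy])
```

Both queries on the "com" namespace select the same two baseline lines; the
line under the extension is selected only when the "crane" namespace is
asked for explicitly.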

>Obviously implementors can deal with this, but I like to keep development
>possibilities as open as possible for as long as possible.

Again, I'm not sure of your concerns.  And this 
was discussed during the teleconference.  Can you 
elaborate on how multiple extensions would 
"close" development possibilities?  I feel that 
encapsulation and the use of namespaces mitigate any problems.

Back to encapsulation, though ... to get the 
metadata you cite above associated with each 
extension we would need something like:

   <cac:UBLExtensions>
     <cac:UBLExtension>
       <cbc:ID>...
       <cbc:Name>...
       <cbc:AgencyID>...
       <cbc:AgencyName>...
       <cbc:AgencyURI>...
       <cbc:URI>... (I'm not sure what this is for?)
       <cbc:ExtensionReasonCode>... (I'm not sure what this adds)
       <cbc:ExtensionReason>... (I suppose this could be helpful text)
       <crane:LegacyExtension>
         ...legacy invoice stuff...
       </crane:LegacyExtension>
     </cac:UBLExtension>
     <cac:UBLExtension>
       <cbc:ID>...
       <cbc:Name>...
       <cbc:AgencyID>...
       <cbc:AgencyName>...
       <cbc:AgencyURI>...
       <cbc:URI>...
       <cbc:ExtensionReasonCode>...
       <cbc:ExtensionReason>...
       <nes:InvoiceExtension>
         ...NES invoice stuff...
       </nes:InvoiceExtension>
     </cac:UBLExtension>
   </cac:UBLExtensions>
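
A minimal sketch of how an application might examine such encapsulated
extensions through the known metadata constructs alone, in Python with the
standard library's ElementTree.  The namespace URIs, the metadata values and
the sample instance are assumptions invented for the example.  The payload
is taken to be whatever child of an extension is not in the metadata
namespace, so its namespace need not be known in advance.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this sketch only.
NS = {
    "cac": "urn:example:ubl:cac",
    "cbc": "urn:example:ubl:cbc",
}

SAMPLE = """\
<cac:UBLExtensions xmlns:cac="urn:example:ubl:cac"
                   xmlns:cbc="urn:example:ubl:cbc"
                   xmlns:crane="urn:example:crane"
                   xmlns:nes="urn:example:nes">
  <cac:UBLExtension>
    <cbc:AgencyID>CRANE</cbc:AgencyID>
    <cbc:ExtensionReason>Legacy print detail</cbc:ExtensionReason>
    <crane:LegacyExtension><crane:InvoiceLine/></crane:LegacyExtension>
  </cac:UBLExtension>
  <cac:UBLExtension>
    <cbc:AgencyID>NES</cbc:AgencyID>
    <cbc:ExtensionReason>NES profile data</cbc:ExtensionReason>
    <nes:InvoiceExtension/>
  </cac:UBLExtension>
</cac:UBLExtensions>
"""


def summarize_extensions(xml_text):
    """Return (agency id, payload element names) for every extension."""
    root = ET.fromstring(xml_text)
    cbc_prefix = "{" + NS["cbc"] + "}"
    summary = []
    for ext in root.findall("cac:UBLExtension", NS):
        agency = ext.findtext("cbc:AgencyID", default="", namespaces=NS)
        # The payload is any child that is not metadata in the cbc namespace.
        payload = [child.tag for child in ext
                   if not child.tag.startswith(cbc_prefix)]
        summary.append((agency, payload))
    return summary


for agency, payload in summarize_extensions(SAMPLE):
    print(agency, payload)
```

Each extension carries its own metadata, so two extensions from different
agencies can be told apart without either application understanding the
other's private content.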

With the above encapsulation, Bryan, do you think 
we will have any non-determinism issues?

I feel very strongly that multiple extensions 
ought to be allowed ... it is the basis of my 
plans to simultaneously support my legacy 
requirements with anyone else's extension 
requirements in a single UBL instance.  To only 
allow one extension at a time would force me to 
keep *two* copies of each invoice, with all of 
the attendant problems of consistency and 
maintenance that introduces:  I could never 
guarantee the information is consistent.  But 
with one instance carrying multiple extensions, 
the baseline UBL information has context in each 
extension environment without the need for multiple copies of the document.

Thanks!

. . . . . . . . . . Ken

--
G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/o/
Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
Male Cancer Awareness Aug'05  http://www.CraneSoftwrights.com/o/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal
References:
- SV: [ubl] Two essential items for Atlantic call
  - From: "Bryan Rasmussen" <BRS@itst.dk>