Subject: Re: [ubl] Re: Namespace URI string implications
At 2006-06-09 13:47 -0700, jon.bosak@sun.com wrote:
>I've just reviewed the messages in this thread, and I have a
>couple of personal observations to make in advance of Ken's
>promised document on the subject.

(finally posted)

>First, I think that it's important to recall the meaning of "minor
>version." I'm deliberately stating this from memory in order to
>expose the state of my understanding: A minor version is one
>against whose schemas instances conforming to the previous major
>version will continue to validate. So, for example, a UBL version
>(let's call it 2.1 for purposes of discussion) is a minor version
>if valid UBL 2.0 instances will continue to validate against new
>UBL 2.1 schemas. Implicit in this definition is the idea that
>minor versions can only contain additions to the previous major
>version; they cannot eliminate any information items or make
>mandatory any information items that were previously declared to
>be optional (though they can make optional items that were
>previously mandatory). In other words, every minor version schema
>is a superset of the previous major version of that schema.

Everything up to here is totally acceptable to me.

>The point of minor versioning is to allow updates to the schemas
>without requiring implementors of the previous major version to
>revise all their software.

But I have a problem with the above ... I believe they *do* have to
revise all their validation processes and software if there are
additions to the document models and we haven't accommodated that in
our NDRs for the major version. I have a discussion of this in
Section 9.3 of my discussion paper.

In a traditional XML-based system, the instance is validated and the
program reads the valid instance into known structures and acts on
the information in those structures. A UBL 2.0 system, therefore, has
2.0 schemas for validation and 2.0 structures built into their
applications.
Pass to this existing UBL 2.0 system a new 2.1 instance with 2.1
structures added to the document ... say a new sibling element at the
end of a bunch of children of an existing element: the validation
fails because it doesn't know of the new child element, and the
application fails because it has too much information with which to
load its data structures.

I've proposed we consider that the NDRs trigger the schema generation
of <xsd:any> with ##any as a last child of all elements with element
children. This needs experimentation, but with this in place in UBL
2.0, a validating process and an application built for 2.0 will not
choke on the presence of a UBL 2.1 construct in the instance.
Heterogeneous network validation upgrades can proceed in a piecemeal
fashion, where only the upgraded nodes validate the new constructs
but the as-yet-to-be-upgraded nodes don't choke on the new
constructs.

>So the first thing I'd like to observe is that if the appearance
>of a 2.0 namespace URI will prevent a 2.0 document instance from
>validating in an environment expecting a 2.1 document, then there
>can be no such thing as a minor version as we have defined it.

I don't think there is a barrier in the above ... since the namespace
hasn't changed, and the new 2.1 information items are optional, a UBL
2.0 instance should validate without problems with a UBL 2.1 model.

The problem is in the other direction ... a heterogeneous network of
validation processes or a heterogeneous set of applications cannot be
wholly upgraded en-masse by being taken down to change everything
before being brought back up again with explicit UBL 2.1 support.

>This doesn't just apply to UBL but to every XML vocabulary.
>So either (a) we and every other XML effort are going to have to
>abandon the concept of minor versioning, or (b) the factor that's
>preventing the 2.0 document from validating in a 2.1 environment
>is a bug in the way that namespaces are implemented, and we're
>going to have to figure out a workaround for it.

I believe the problem is not in that direction, but the other way
around. It is the established installed base that breaks in the
presence of 2.1, not a new 2.1 installation breaking in the presence
of 2.0.

>The second thing I'd like to say is that I personally believe the
>notion of blind interchange to be unrealistic. I simply cannot
>imagine a real-world business accepting either a purchase order or
>an invoice without some prior out-of-band agreement (even if it's
>only a handshake or a phone conversation).

I've been talking about blind interchange *at the program level*, not
at the business level. I agree ... I'm not going to send an invoice
following the (imaginary) Canadian Subset extensions to the Danish
implementation of the North European Subset and expect to be paid
just because I've sent the instance. We will have agreed that I am
allowed to send the instance, their systems will be prepared to
accept the instance, and they would have documented their NES and
country-specific specializations of the NES.

The serendipity happens when that application of theirs can accept my
instance without having made any changes to the software because it
accepts the common bits of UBL-standard information items and accepts
that as sufficient without the absolute requirement for the extra
levels of detail and information found in extensions.
The serendipity I've been thinking about isn't the blind "I hope the
system processes this invoice without knowing about me", but the
"lucky the system was defined this way that when I send my invoice
they are expecting from me that they don't have to change their
software and I don't have to change my instance (or my software that
created my instance)."

>Common B2C portals like amazon.com are not examples of blind
>interchange, because they enforce the input format through
>generation of the portal input forms, and they rely upon payment
>agreements that are far from ad hoc. If anyone can think of a
>real-world example of the unconstrained blind interchange of a
>legally binding business document, I'd like to hear it. This seems
>somehow to have become a requirement, but I'm not sure whose it is.

I've been talking about the implementation being blind to the
extensions in a received instance it knows nothing about and being
able to process an instance without change ... not the blind business
aspects of "slipping in an invoice undetected."

>Being kind of a simple-minded guy, therefore, I conceive this
>issue in terms of the following scenario and its two basic forms.
>
>Scenario: Company A has implemented 2.1 in software, while company
>B is still at 2.0. A and B have thought this through together and
>have decided that A can do without 2.1 items in 2.0 instances from
>B and B can ignore the added items in 2.1, thus enabling B to
>avoid a software upgrade.

But B cannot ignore the added items in 2.1 if we don't make special
accommodations in our NDRs that were not present in the January 2006
beta release of 2.0 ... and implementations that have pre-compiled
structures based on the January 2006 2.0 schemas will reject the
loading of XML instances that have unexpectedly-structured content.
Without the accommodations, validation fails and programs stop
working.
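
The failure mode described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not UBL
code: the element names (Invoice, ID, IssueDate, NewItem) and both
loader functions are hypothetical, and a real 2.0 implementation would
use structures generated from the 2.0 schemas. The strict loader
stands in for pre-compiled structures that reject unexpected content;
the tolerant loader stands in for an application that skips content it
does not recognize.

```python
import xml.etree.ElementTree as ET

# Hypothetical child elements known to a 2.0-era application.
KNOWN_CHILDREN = ("ID", "IssueDate")


class UnexpectedContent(Exception):
    """Raised when an instance carries a child the loader was not built for."""


def load_strict(xml_text):
    """Load known children into a record; fail on anything unrecognized."""
    record = {}
    for child in ET.fromstring(xml_text):
        if child.tag not in KNOWN_CHILDREN:
            raise UnexpectedContent(child.tag)
        record[child.tag] = child.text
    return record


def load_tolerant(xml_text):
    """Load known children into a record; silently skip everything else."""
    return {
        child.tag: child.text
        for child in ET.fromstring(xml_text)
        if child.tag in KNOWN_CHILDREN
    }


# A 2.0-style instance, and the same instance with one 2.1-style addition.
V20 = "<Invoice><ID>42</ID><IssueDate>2006-06-09</IssueDate></Invoice>"
V21 = V20.replace("</Invoice>", "<NewItem>x</NewItem></Invoice>")
```

With these definitions, load_strict(V20) and load_tolerant(V21) return
the same two-field record, while load_strict(V21) raises
UnexpectedContent.
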
If we experiment with 2.0 having <xsd:any> with ##any as the last
child of every element with element children, then 2.0 schemas will
accommodate 2.1 elements and 2.0 pre-compiled structures should be
able to ignore the 2.1 elements without error.

And I have proposed in my paper that the new 2.1 information items
have their own 2.1 namespace URI string ... we can then detect new
constructs by their namespace, and users of UBL will be able to track
down the definition of new constructs by knowing in which release
they have been added. Every instance will have clearly demarcated in
each information item from where that information item is defined.

>Situation 1: B sends A a 2.0 document.
>
>   Solution for Situation 1: A's input filter peeks at B's
>   document and changes the namespace to 2.1 before processing in
>   order to fool its 2.1 software into handling it. (By our
>   definition of minor version, a valid 2.0 document is, except
>   for the namespace declaration, also a valid 2.1 document.) We
>   can characterize this as an XSLT solution if we want, but the
>   fact is that it could be done with sed or perl or even by hand.

None of this should be necessary ... a 2.1 application already
validates and accommodates a 2.0 instance.

>   Note that we already considered this approach when discussing
>   customization two years ago in Hong Kong; from my notes of that
>   session (published to the TC that week as
>   chair-opinion-20040513.pdf):
>
>      Use case 2
>
>       - An XYZ industry profile is developed by defining XYZ
>         schemas that are proper subsets of the UBL 1.0
>         schemas. The definition of "proper subset" is
>         that any valid XYZ instance is also a valid UBL 1.0
>         instance.

Candidate users of UBL indicate they also need to be able to add
information items ... hence our creation of the extension point.
>       - Action for UBL TC: Because the XYZ instances will carry a
>         non-UBL namespace, we need to (or should) develop a
>         simple technique whereby XYZ instances can be made to
>         look to off-the-shelf UBL 1.0 applications like UBL 1.0
>         instances. Perhaps this could take the form of a
>         configuration file recommended for inclusion in every
>         conformant UBL 1.0 processor that will allow it to
>         recognize that the XYZ namespace is in fact a subset of
>         the UBL 1.0 namespace and substitute the UBL 1.0
>         namespace for the XYZ namespace as the first step in
>         instance processing.

I don't think namespace substitution is the way to go ... we lose
identity of the constructs. I do think transformation is critical for
subsets, because subsets can totally remove an optional element and
the transformation is needed to remove the valid presence of that
optional UBL element before the subset application, tuned not to
receive the optional element, gets the instance.

>   Note also that any scenario in which A's input filter can peek
>   at the namespace URI before validating it is a scenario in
>   which it can peek at a version attribute or element before
>   validating it. So I don't see why the version info has to be
>   in the namespace URI.

I'm not sure about the "peeking" business ... because the information
items are labelled with namespaces, A is going to have to know all
possible ways of identifying the element if it is going to do
conditional processing.

>Situation 2: A sends B a 2.1 document.
>
>   Solution for Situation 2: An XSLT filter (or perl script or
>   whatever) at B strips out the information items not in 2.0
>   (thus changing it into something indistinguishable from a 2.0
>   instance) *and* changes the namespace URI back to 2.0 so that
>   B's software can process it. This is presumably something like
>   what Ken is going to propose to us.
Yes, but since I have proposed that only new UBL 2.1 information
items have the new 2.1 namespace URI, and any existing UBL 2.0
information item retain its old 2.0 namespace URI, only a
transformation is going on of removing unexpected constructs ... no
transliteration between namespaces.

In your scenario, "B" has to know the existence of 2.1 before it can
accommodate 2.1 by changing it to 2.0 ... in my scenario, "B" does
not have to know the existence of 2.1 because it can automatically
transform an instance of 2.1 into an instance of 2.0 because it is
preserving the 2.0 constructs and eliding other stuff it doesn't
recognize. The B user doesn't have to change anything until it wants
access to 2.1 constructs ... the installed 2.0 automatically
accommodates 2.1, 2.2, ad infinitum until it wants to.

I'm trying to think here of installations and existing software and
systems ... I'll let the business rules of running business determine
when software gets the instances, I just want the software to work
with new instances without any changes, or upgrades, or modified
transformation filters. As described in my paper.

>   If so, I'd like to recommend that the appropriate XSLT filter
>   be made part of each minor version release.

Even so, it would require installation with every minor release ... I
don't think we can expect users to accept that. I think installations
would be interested in a software system design that only has to be
changed when you want access to new information and does not need to
be changed in any way if it is already doing what the user wants.

>   Note again that if you can believe in an input process that can
>   peep the namespace URI, you can believe that it can just as
>   easily (or darn near as easily) peep a version attribute or
>   element. So as before, I don't see why the version info has to
>   be in the namespace URI.
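
For reference, the filtering approach described above (preserve the
constructs in the namespace the receiver already knows, elide anything
else) can be sketched as follows. This is a minimal Python sketch
rather than the XSLT filter discussed in this thread, and it is not
taken from the paper: the namespace URIs, the element names and the
function names are all hypothetical placeholders. It assumes the
document element itself is in a known namespace, and it leaves
attributes untouched.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only for this illustration.
NS_20 = "urn:example:ubl:2.0"
NS_21 = "urn:example:ubl:2.1"


def namespace_of(tag):
    """Return the namespace URI from an ElementTree '{uri}local' tag."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def keep_known(elem, known):
    """Remove, in place, every descendant whose namespace is not in 'known'."""
    for child in list(elem):  # copy the child list, since we remove while looping
        if namespace_of(child.tag) in known:
            keep_known(child, known)
        else:
            elem.remove(child)
    return elem


# A 2.0-style instance carrying two items from a later (2.1-style) namespace.
SAMPLE = (
    '<a:Invoice xmlns:a="urn:example:ubl:2.0" xmlns:b="urn:example:ubl:2.1">'
    "<a:ID>42</a:ID><b:NewItem>x</b:NewItem>"
    "<a:Party><a:Name>Acme</a:Name><b:Extra>y</b:Extra></a:Party>"
    "</a:Invoice>"
)

filtered = keep_known(ET.fromstring(SAMPLE), {NS_20})
```

Applied to SAMPLE with only NS_20 in the known set, the result keeps
Invoice, ID, Party and Name and drops NewItem and Extra. The filter
never mentions NS_21, so it would behave the same way for items from
any later namespace it has not been told about.
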
I think it would help to have the version info in the namespace URI
at the divination of a new UBL information item, but not change the
namespace URI of any existing UBL information items.

>The only way I can imagine Situation 2 working in a blind
>interchange environment is if B, upon receiving a 2.1 instance
>from a previously unknown potential partner A, responds with a
>message to the effect that information items beyond those
>specified in 2.0 will be ignored, continue anyway? -- something
>like what you get when you open a current word processor document
>in an old version of the software. But again, I find it hard to
>imagine this working effectively in real life.

The business side can determine what makes sense to send back and
forth ... on the technical side, I believe my approach is resilient
to change, forward compatible, and gives implementations the luxury
of changing when they want to, not when they have to.

I've not seen this approach I've proposed taken before, of a "filter
only those things I'm expecting" making it forward compatible and
supporting heterogeneous networks of installations, and using
namespace URI strings *at the information item level* ... but I'm
quite confident it will work. And I think installations will
appreciate these features.

I'll leave other comments to discussion of the paper I've posted.

Thanks, Jon, for sharing your thoughts ... please continue this
thread if you have specific questions on my comments above, or phrase
your concerns in terms of the sections of the paper.

. . . . . . . . . . . . . Ken

p.s. I'm rushing to write this ... please excuse any obvious typos.

--
Registration open for XSLT/XSL-FO training: Wash.,DC 2006-06-12/16
Also for XSL-FO/XSLT/XML training: Birmingham, UK 2006-07-04/13
Also for XSL-FO/XSLT training: Minneapolis, MN 2006-07-31/08-04
Also for XML/XSLT/XSL-FO/UBL training: Varo,Denmark 06-09-25/10-06
World-wide corporate, govt. & user group UBL, XSL, & XML training.
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/o/
Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (F:-0995)
Male Cancer Awareness Aug'05 http://www.CraneSoftwrights.com/o/bc
Legal business disclaimers: http://www.CraneSoftwrights.com/legal