Subject: Re: [ubl-dev] A personal perspective on considerations for UBL subsets, extensions, versions, validation and interchange
Good morning, Fraser, thank you for your feedback.

At 2006-06-19 11:27 +0100, Fraser Goffin wrote:
>I've been meaning to feedback some thoughts about my initial read of
>your doc (UBL 2.0 subsets, extensions, versions, validation and
>interchange) for a week or so now, but as usual got distracted.
>Fortunately, I made a few notes, so without re-reading to see if I've
>misunderstood something (I'm sure you'll point that out ;-) here they
>are :-

Thanks! Understandably, consideration of this feedback did not make it into last night's version 0.3 ... sorry I didn't announce it last night but it was late on Father's Day and I was anxious to turn the computer off.

>Section 3.3 - The 'Serendipity Factor'
>
>Final para/ final sentence - what does '... without authorization or
>intervention to prevent misuse', mean ?

It means that what I call "serendipity" and "blind interchange" is not meant to imply that the system design can accept a document out of the blue without authorization or without some intervention by the receiver. Early readers misinterpreted me as expecting to design an environment that would respond to unexpected and unsolicited document interchange ... when in fact I was not describing my concept of serendipity from a business perspective, rather, only from a system perspective.

So, the system design accommodates an authorized transaction with another system without having to retool, thus making UBL a desirable technology to embrace ... not that an open UBL system is somehow open to abuse because barriers have been removed.

>Section 3.4 - A pure-XSD expression of constraints is highly desirable
>
>>'Independently I've had three people comment to me of the importance of
>>equipping programmers with XSD expressions of XML document models
>>because these programmers never see the angle brackets of XML....'
>
>I also find it desirable to have a pure XSD expression of constraints
>although not really because of this reason.

Interesting!

>In fact, I often think
>that this requirement is somewhat over-played and may be a reflection of
>a distinction between those who are really implementing data/object
>centric services via an XML veneer of an existing system API,

Okay ... I can only go by what I hear from those who give me their admonitions of other technologies because of their pure-W3C-schema interface to the markup world.

>and those that
>are concerned with document/business process led development.
>Of course there are cases for both, but personally I am predominantly in
>the latter camp and although I recognise that probably because of
>historical evolution of XML based services the majority may still be
>in the former. I think this will change.

I'm pleased to hear this.

>I am much more inclined to the view that XML/XSD is a 'first class'
>type system and that a preoccupation with mapping to/from other
>programmatic type systems is very often un-necessary, inefficient and
>sometimes impossible !

Well, that certainly is not what I was told, independently, during my 'round-the-world UBL and standards trip last month.

>I don't happen to think that the APIs for
>manipulating XML natively are any more difficult than any other, and
>in many cases represent the best fit for purpose tooling available.
>I'm sure you are aware of the perma-threads that continue to run on
>this subject, many concluding that :-
>
>a. XML <-> Object mapping suffers significant fidelity and impedance
>issues (as does XML <-> relational). This typically leads to the need
>to only use a compatible subset of XSD types/derivations/content
>models.
>
>b. In turning XML into a class hierarchy, flexibility in the face of
>change is degraded often leading to brittle, intrusive and expensive
>change management (even for what might in some cases be considered as
>a minor (non breaking) changes in XML).

Thanks for that summary ... I was not aware of the issues ...

>Not saying the justification that you have included is wrong or
>mis-leading (clearly neither are true given your later statement about
>the 'overwhelming feedback'), I just feel that the case might be made
>stronger by recognising that this is not everyone's motivation.

But it happens to be the motivation expressed by sufficient numbers to practically forbid a non-W3C-Schema approach.

Thanks for taking the time to summarize that.

>Section 5.1 - UBL Conformant Instances
>
>To be clear, is an instance UBL conformant if no constraint violations
>occurs when validating against the FULL UBL schema, a subset UBL
>schema, or both ?

(Remember that the committee hasn't adopted these terms, these are just my proposals).

I think "UBL Conformance" is independent of "Subset Conformance" and that it will be up to subset definers to define what they see as conformance to a subset. So, in this clause, I was thinking solely about no constraint violations of the full UBL schema.

>Section 5.3 - UBL open systems
>
>(3) seems in conflict with serendipitous exchange ?

Oh, I was hoping that was fully in support of (and equivalent to) serendipitous exchange. I seem unable to convey that I'm talking solely about the resilience of a system: it accommodates unexpected inputs by acting on the expected content, without rejecting the instance because of the unexpected content.

Perhaps I should change the wording. I was hoping the term "UBL Open System" would characterize those systems supporting serendipitous exchanges, such that retooling would not be necessary to accommodate new business relationships set up with trading partners already using UBL with systems they've already created for use in other contexts.

>Section 7.0 SubSets
>
>>It should be a stated guideline for subsets
>>that any information item that most
>>appropriately belongs in the standardized component should go in the
>>standardized component and not in the subset extension.
>
>I like that statement a lot.
>But I am concerned about the governance model.

Indeed ... but I thought it was just common sense.

>First, how does UBL keep control of its own standard

I don't see a "controlling" role for the UBL technical committee ... since the standard is open for anyone to use (and, indeed, every month we are hearing of existing deployments of UBL of which no-one on the committee was aware) there really is no governance. Only guidance.

The only *normative* components of UBL are the schema expressions. The committee can only publish guidelines for use, and with my training I am only advocating approaches to using the normative components.

>and ensure (as
>far as that is reasonable) that implementers don't abuse the standard
>for private 'bastardized' exchange vocabularies using extensibility as
>the mechanism for introducing data that UBL either won't sanction or
>doesn't include quickly enough (this currently seems to be in part
>reliant on NDR and partly on UBL Conformance processes that you
>describe later - but is this sufficient, and is it too complicated ?
>(in my case this has been the primary reason for the standards body
>dis-allowing any extension of the standard even for private data (they
>fear over-use of that facility and a consequent diminishment of the
>standard) ?

I see ... well, I suppose that is a risk ... and indeed we saw that in the HTML world when there were tangents in the early days that were not finally reined in until the W3C brought out CSS to stop the bastardization that was going on by browser manufacturers.

But hopefully with the extension point we have engineered extensibility in such a way that UBL can continue to interoperate, and when it comes time to do a UBL 3.0 we can look at common uses of extensions to determine what might be best migrated into the body.

>Second, how do implementers ensure that where they expose a service
>interface that claims conformance to UBL, that if their trading
>partners send non standard stuff they can (should ?)
>detect and reject ? I assume this is covered by the statement that
>extension MUST be within the extension 'area' only and a subset can
>only validly remove optional information items (thus producing an
>instance that is valid to the superset standard).

Yes, I was hoping that statement would be followed by subset designers, for the reason that an instance of the subset is still an instance of full UBL.

>This makes me think of 2 points :-
>
>1. Does it matter whether structural validation is performed using a
>subset schema (ie. one with optionals of no interest removed) or
>against the full UBL schema ? (I guess this is really a question about
>how to ensure that a subset schema is a valid instance of the
>corresponding UBL schema (particularly given the statements later
>about 'transform before validate') ?

Indeed I raised this in my revisions in last night's version ... section 8.2.2. "Application handling of an arbitrary UBL instance input" talks about the option of validating the incoming instance against full UBL *before* performing the transform-before-validate process. That way the transformation does not mask truly non-UBL-conformant instances by deleting the "bad stuff" and thereby making such an instance subset-conformant.

>2. Is it reasonable to assert (insist ?) that implementers shouldn't
>invent their own data types/aggregates if one exists in the standard.

As a guideline, yes ... if they "break" that rule they will have less interoperability with open UBL systems, so I'm hoping the common sense reason will be sufficient for implementers.

>If they need something with the same semantics and structure in a
>private extension I think they should use the standard. But should
>this be explicitly declared as UBL (in the UBL namespace) or should
>the process be for implementers to 'borrow' from the standard but use
>their own namespace (I think this formed part of your reasoning behind
>processContents='skip') ?

Extensions are allowed to utilize UBL constructs in the expression of the extension semantic (provided the apex of the extension, that being the child of the UBL extension point, is itself not a UBL construct) because the ancestral labeling of whatever is using the UBL constructs indicates a different purpose.

I would hope, for example, that an extension definition of a new party would exploit the existing party definition ... indeed it would not make sense (though I admit that isn't a very strong constraint) for a subset definer to define a new party construct when all they need is a new party parent that uses the existing party construct.

>Section 7.1 - The choice of XSD for schema expression
>
>2nd from last para :-
>
>>'... a transformation that removes the information items not desirable to the
>>subset,..'
>
>Granted, but it might be useful to include something that picks up
>David Orchard's distinction of 'Must Ignore Unknown' approaches,
>specifically whether 'retain' or 'discard' is used. It is possible
>that information items that arrive in an inbound message are part of
>the required output, even when they are unused by the receivers
>business process (i.e. a sender may send and expect to receive a full
>UBL instance).

Well, I can understand the "send", but if the sender is dealing with a subset user, then I would not expect the sender to expect to receive a full UBL instance since the other party isn't a full UBL system.

>Similarly there may be legal, audit or other regulatory
>requirements which require that some items are reflected in request
>and responses and/or passed through to upstream processes.

How would these be described, and wouldn't the requirements be so arbitrary as to be inexpressible in a declarative format?

>I have seen
>this point made on the newsgroup in regard to whether exchanges are
>based on 'caveat emptor' or 'caveat venditor'.
>
>You might argue that this is simply the process of determining the
>subset schemata and filter processing, but it might be worth pointing
>out so as readers don't forget ?

Could you clarify the point to make? That senders *cannot* expect anything beyond the fence (now described in last night's section 7.2 "The subset fence")? Perhaps my new focus on that subject is sufficient ... please let me know.

I hope that I was able to address your concerns here in what I posted last night.

>Section 7.2. - Subset UBL Conformance
>
>>a subset instance must be UBL-conformant
>
>Also subset schemata presumably ?

Yes, that was implied but I see perhaps I should have stated that explicitly.

>First para after bullet points:
>
>>'A subset schema cannot be used for validation directly ....
>
>Would it be a desirable approach to validate to the FULL UBL schema
>BEFORE the filter transform ?.

Indeed I had anticipated this question based on more analysis of my data flow diagrams and explicitly call this out in Figure 4. "Subset handling of arbitrary UBL instance input".

>Notwithstanding the subset deployment
>recommendations in section 8, if implementers didn't go that far for
>whatever reason, or they just got part of it wrong (say the filter
>processing was 'buggy' - it might give the appearance of valid UBL,
>but it is actually making invalid UBL 'valid' by inadvertently
>removing invalid items), wouldn't it be better to separate out UBL
>conformance so that :-
>
>a) a received message can be checked to be fully UBL conformant and if
>not rejected as such
>
>b) the filter processing is based ONLY on valid UBL instances. If the
>subset validation fails, the reasons can be clearly distinguished (the
>message does not conform to the subset schemata/rules, or the filter
>is buggy).

Right ... which is why I've added it in ... I realized there was the opportunity for "false positives" because the transform-before-validate transformation could hide UBL inconsistencies.
But note that I've still made it optional (though recommended) given that implementers can choose how complex to make their systems. Their choice of implementation level will dictate how complete and interoperable their systems are.

This, again, was driven into me during my trip: "make UBL easy to deploy" ... but my rebuttal of "but an easily-deployed system will have drawbacks in interoperability and error detection" was disregarded.

So, you will see in effect three levels of implementation from easiest (least overhead) to hardest (most overhead):

 (1) - Figure 3. "Subset handling of pure subset instance input"
 (2) - Figure 4. "Subset handling of arbitrary UBL instance input"
       without full UBL checking
 (3) - Figure 4. "Subset handling of arbitrary UBL instance input"
       with full UBL checking

>Section 8.2.2 - Application handling of an arbitrary instance input
>
>Final para/sentence :
>
>>'Considering section 5.3, .....
>
>Doesn't this conflict with the idea that an 'open' UBL system should
>be able to operate even with extensions and optional items that it can
>process, absent. I'm not sure, I'm having a bit of trouble with this
>concept. Are we talking about some form of 'fall back' behaviour ?

No, I'm trying to convey that a "UBL Open system" is one that implements transform-before-validate ... thus accepting any instance of UBL, because by the time the data reaches the application it contains only the data that the application expects and none that it doesn't expect.

>Figure 4.
>
>Previous comment about the desirability of running validation to full
>UBL schema before filter transform ? Do you think this is un-necessary
>?

I think that mandating it would leave the impression the system was too complex, but that recommending it would give implementations an extra level of conformance checking and would prevent the false positives from getting through.
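
As a rough illustration of levels (2) and (3) and of the false-positive risk, here is a minimal Python sketch. It is not taken from the paper and makes loud assumptions: the element names, the `FULL_UBL` and `SUBSET` vocabularies, and the `receive` function are all hypothetical, and "validation" is reduced to checking element names against a set, standing in for real W3C Schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies standing in for real schemas (assumptions, not UBL).
FULL_UBL = {"Order", "ID", "IssueDate", "Note", "LineItem", "Quantity"}
SUBSET = {"Order", "ID", "LineItem", "Quantity"}


def unknown_elements(root, allowed):
    """Toy 'validation': list the tags in the tree that are not allowed."""
    return sorted({el.tag for el in root.iter() if el.tag not in allowed})


def filter_to_subset(root, allowed):
    """Filter transform: remove, in place, every descendant outside the subset."""
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag not in allowed:
                parent.remove(child)
    return root


def receive(xml_text, check_full_first):
    """Transform-before-validate, with an optional full-vocabulary check first."""
    root = ET.fromstring(xml_text)
    if check_full_first:
        # Level (3): reject anything that is not conformant to the full model.
        bad = unknown_elements(root, FULL_UBL)
        if bad:
            return "rejected: not UBL-conformant " + str(bad)
    # Levels (2) and (3): filter down to the subset, then validate the result.
    filter_to_subset(root, SUBSET)
    bad = unknown_elements(root, SUBSET)
    if bad:
        return "rejected: not subset-conformant " + str(bad)
    return "accepted"


# 'Bogus' is in neither vocabulary, so this instance is not conformant.
invalid = "<Order><ID>1</ID><Bogus>x</Bogus></Order>"
print(receive(invalid, check_full_first=False))  # filter hides the problem
print(receive(invalid, check_full_first=True))   # problem is reported
```

Without the first check the filter silently deletes `Bogus` and the instance is accepted; with it, the same instance is rejected before the filter can mask the problem. A real subset schema would also check structure and cardinality, which this name-only stand-in does not.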

Please let it be known if you think I should change the dotted lines in Figure 4 into solid lines and mandate the UBL schema checking ... I can certainly live with that, and indeed would prefer it, but I'm trying to be accommodating by making it optional. Perhaps that is too accommodating.

>Section 9 - Versions of UBL
>
>Phew, this is interesting, but I'm still mulling it over. A few things
>for now :-
>
>- given you are proposing processContents='skip' for extensions, I
>think it would be useful to a) explicitly identify that for those who
>want to validate data in an extension they need to do something extra
>(I didn't feel like this was covered by section 8) and, b) perhaps
>provide a description of some of the approaches that could be
>considered ?

Does the revised 8.1.2. "Subset supplemental validation artefact preparation" section now underscore the role of subset business rules? My exemplar is the correlation of detailed line-item-level information in the extension with the summary line-item-level information in standardized UBL.

>Section 9.4 - A running example of the proposed version extensibility
>
>ubl2.xsd - processContents = 'skip' for 'Extension and
>FutureVersions' - still think this could/should be 'lax', but I guess
>it somewhat depends on whether UBL want to allow implementers to use
>UBL namespaced items in an extension ?

I remain unconvinced. :{)} I am so worried that using "lax" will trigger false negatives by dictating to subset definers what their subsets should look like, when it should be totally up to the subset definers.

>ubl21.xsd :-
> - why no Extension within element 'LineItem' ?

I supported this (and for party):

  http://lists.oasis-open.org/archives/ubl/200604/msg00010.html

But it was decided to have only one extension point:

  http://lists.oasis-open.org/archives/ubl/200604/msg00027.html (end of post)

And by now I'm convinced there should be only one extension point.

> - the LineItem content model is non deterministic isn't it ?
>(you have an optional element (u21:CountryOrigin) declared before
>'FutureVersions') ??

I believe it would be non-deterministic the other way, but not this way. The particles are easily distinguished if the UBL 2.1 items are before extensions ... except for misspelled UBL 2.1 items that end up falling under the version extension.

>Haven't seen any other comments on this doc on this list. Should I be
>looking elsewhere ?

No, I'm afraid that no-one else has taken the time to comment ... not even in committee. Though I acknowledge it is a very long document to digest, I would not have written it all if I did not think we were at the most crucial stage for making such decisions, before casting our next set of schemas (which are candidates for being final schemas).

I do appreciate that you took the time to post something, Fraser, thank you! I hope others will share their thoughts soon based on my post of last night.

. . . . . . . . Ken

--
Registration open for UBL training: Montréal, Canada 2006-08-07
Also for XSL-FO/XSLT training: Minneapolis, MN 2006-07-31/08-04
Also for UBL/XML/XSLT/XSL-FO training: Varo,Denmark 06-09-25/10-06
World-wide corporate, govt. & user group UBL, XSL, & XML training.
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/u/
Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (F:-0995)
Male Cancer Awareness Aug'05 http://www.CraneSoftwrights.com/u/bc
Legal business disclaimers: http://www.CraneSoftwrights.com/legal