ubl-dev message

Subject: Re: [ubl-dev] Re: Code list extensibility and substitution groups
From: "William J. Kammerer" <wkammerer@novannet.com>
To: <ubl-dev@lists.oasis-open.org>
Date: Mon, 21 Feb 2005 21:47:13 -0500
I'm guessing that substitutionGroups mean "any extensions to the code
lists themselves cannot change in structure, only the enumerated sets
themselves can change," as the substituted element has either the same
type as the "head" abstract element - or one which can be derived from
it. I think that's the advantage of substitutionGroups over redefine;
there'd probably be nothing keeping you from changing the structure with
a redefine.
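
As a rough illustration of the paragraph above (not taken from the original thread), the sketch below validates three instances against a tiny schema in which an abstract head element is replaced by a substitution-group member whose type is derived by restriction. The names Code, CurrencyCode, Amount and SubstitutionGroupDemo are hypothetical and do not come from the UBL schemas; only the standard javax.xml.validation API is assumed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SubstitutionGroupDemo {

    // Hypothetical schema: "Code" is the abstract head; "CurrencyCode" may
    // stand in for it because its type restricts the head's type.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:simpleType name='CodeType'><xs:restriction base='xs:token'/></xs:simpleType>"
        + "<xs:element name='Code' type='CodeType' abstract='true'/>"
        + "<xs:simpleType name='CurrencyCodeType'>"
        + "  <xs:restriction base='CodeType'>"
        + "    <xs:enumeration value='CAD'/><xs:enumeration value='USD'/>"
        + "  </xs:restriction>"
        + "</xs:simpleType>"
        + "<xs:element name='CurrencyCode' type='CurrencyCodeType' substitutionGroup='Code'/>"
        + "<xs:element name='Amount'><xs:complexType><xs:sequence>"
        + "  <xs:element ref='Code'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    static Schema loadSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema itself is broken", e);
        }
    }

    // Returns true when the instance validates against the schema above.
    static boolean isValid(String xml) {
        Validator validator = loadSchema().newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error reported by the default handler
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Substituted element with an enumerated value.
        System.out.println(isValid("<Amount><CurrencyCode>CAD</CurrencyCode></Amount>"));
        // Same structure, value outside the enumerated set.
        System.out.println(isValid("<Amount><CurrencyCode>XXX</CurrencyCode></Amount>"));
        // The abstract head itself may not appear in an instance.
        System.out.println(isValid("<Amount><Code>CAD</Code></Amount>"));
    }
}
```

In this sketch the content model of Amount never changes; only the enumerated set carried by the substituted element varies, which is the property being guessed at above.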

But in order for UBL to provide the (future) capability of "override,"
all the schemas for off-the-shelf code lists will probably have to be
modified to accommodate any possible future abstraction (kind of like
C++ virtual functions). I guess that's why the Code List group has to
make a decision now; and they won't know whether it's worth making these
changes unless someone can demonstrate how this substitutionGroup stuff
can be used.

William J. Kammerer
Novannet
Columbus, OH 43221-3859 . USA
+1 (614) 487-0320

----- Original Message ----- 
From: "Duane Nickull" <dnickull@adobe.com>
To: <jon.bosak@sun.com>
Cc: <ubl-dev@lists.oasis-open.org>
Sent: Monday, 21 February, 2005 07:36 PM
Subject: Re: [ubl-dev] Re: Code list extensibility and substitution groups


Jon:

Apologies - several of us couldn't resist taking a shot at CAM.  You are
right and we should follow ocCAM's Razor - "one should not increase,
beyond what is necessary, the number of entities required to explain
anything".  Seems fitting, doesn't it ;-)
http://pespmc1.vub.ac.be/OCCAMRAZ.html

The code list issue is a serious one and I do have one question about
determinism in this context.  Does this primarily refer to the fact that
any extensions to the code lists themselves cannot change in structure,
only the enumerated sets themselves can change?  Or does it imply a more
sinister pre-requisite knowledge of the entire enumerated set of values
AND the structure and both may be subject to substitution?

I do not see how you can both offer extensibility beyond that while
still preserving interoperability.  I think that looking at what
developers will have to do to access the code list values is important
in order to fully grok the complexity of the problem. My observation
would be to strictly define the logical data model and XML expression
for structure of code lists in order to allow deterministic statements
to be evaluated to retrieve code list values and marshal those into
objects during the parsing process.  For example, you could define an XML
structure that will always give you a List object containing all the
values for codes.  The Java could be written like this:
// parsing the schema for enumerated values
    public InputStream[] getDataElementStreams() throws Exception {
        List codes = this.currentElement.getChildren(
                CodeValueElement.SOME_FINAL_TOKEN_HERE);
        InputStream[] ret = new InputStream[codes.size()];
        for (int i = 0; i < codes.size(); i++) {
            try {
                ret[i] = new DataCodeElementRef(
                        (Element) codes.get(i)).getInputStream();
            } catch (IOException e) {
                throw new AssemblyException(
                        "You wrecked UBL codes forever....", e);
            }
        }
        return ret;
    }

This would allow a schema parser to interpret the entire substitute code
list as long as the structure rules were followed.  That is about as
deterministic as you can get IMO.
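
To make the idea concrete, here is a minimal self-contained sketch (an illustration only, not part of the original message) that assumes a fixed, hypothetical structure of the form <CodeList><Code value="..."/></CodeList> and uses the JDK's DOM parser. The element name, attribute name and the CodeListReader class are assumptions, not anything defined by UBL.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical fixed structure: <CodeList><Code value="..."/>...</CodeList>.
class CodeListReader {

    // Returns every code value in document order. The same statement works
    // for any substituted list as long as the structure is unchanged.
    static List<String> readCodes(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<String> codes = new ArrayList<>();
            NodeList nodes = doc.getDocumentElement().getElementsByTagName("Code");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element code = (Element) nodes.item(i);
                codes.add(code.getAttribute("value"));
            }
            return codes;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not read code list", e);
        }
    }

    public static void main(String[] args) {
        // An "extended" list: one extra value, identical structure.
        String extended = "<CodeList>"
            + "<Code value=\"CAD\"/><Code value=\"USD\"/><Code value=\"XTS\"/>"
            + "</CodeList>";
        System.out.println(readCodes(extended)); // prints [CAD, USD, XTS]
    }
}
```

The retrieval code never needs to know which values the list holds, only where they live; an extension that adds or removes values leaves it untouched, while one that renames or nests the elements would break it.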

The GoC had some really compelling use cases for conditional validation
of code set values based on qualifiers.  The ability to support their
use case was not present in the current draft of W3C Schema; however, some
issues were fixable by defining a better object model before expressing
it in XML (although I wouldn't want to start yet another elements vs.
attributes holy war).

I did see some cases where there is ambiguity in the UBL code list
specification.  For instance, what is the difference between a code list
identifier, code list name identifier, code list URI and a code list
name text?  The URI to me is a specialized instance of identifier - I
ponder why more than two are needed.

If you allow changes to the structure, you are doomed.  No one can
effectively process XML if the structure itself is compromised from
instance to instance - that is why we developed DTDs, schemas etc. in
the first place, isn't it??

My $0.02 CAD worth (despite William thinking the currency is doomed).

Duane
-- 
***********
Senior Standards Strategist - Adobe Systems, Inc. - http://www.adobe.com
Vice Chair - UN/CEFACT Bureau Plenary - http://www.unece.org/cefact/
Adobe Enterprise Developer Resources  -
http://www.adobe.com/enterprise/developer/main.html
***********