tag message

Subject: RE: [tag] Proposal: Providing Decentralized Extensiblity of Enumerative-Attribute Values

From: "Dennis E. Hamilton" <dennis.hamilton@acm.org>
To: <stephen.green@documentengineeringservices.com>
Date: Sat, 14 Nov 2009 10:07:04 -0800

I'm concerned we are confusing syntactic well-formedness with semantic
validity.  XML Schema is entirely syntactic, although there is some
semantic-ness around the presumed sense of data types.  There is no
practical way, beyond trivial cases, to ensure that syntactic well-formedness
is sufficient for semantic validity.

QName is much more restrictive and specific than normalized attribute
values.  And the XML Schema support for the QName datatype has the
exactly-correct semantics.  I don't understand how this allows "almost any
arbitrary string" and at the same time there is concern that it constrains
the custom values.  QName restricts the values to NCNames serving as local
names within a namespace, but anybody can originate an unambiguous, unique
namespace.  And that namespace definition can provide a mapping to any
arbitrary external code list as part of its definition.
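
To make the QName constraint concrete, here is a minimal sketch in Python of
how a prefixed attribute value could be resolved to a (namespace, local name)
pair.  It is an illustration only, not part of the proposal: the function
name resolve_qname, the prefix table, and the example.org namespace are
hypothetical, and the NCName pattern is an ASCII-only approximation of the
real XML production.

```python
import re

# ASCII-only approximation of the XML NCName production (an assumption made
# for brevity; the real production also admits many non-ASCII characters).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def resolve_qname(value, prefixes):
    """Resolve a lexical QName to (namespace_uri, local_name).

    `prefixes` maps in-scope prefixes to namespace URIs; the key "" stands
    for the default namespace, if one is declared.  Raises ValueError when
    the value is not a well-formed QName or its prefix is undeclared.
    """
    prefix, sep, local = value.partition(":")
    if not sep:
        # No colon: the whole value is the local name.
        prefix, local = "", value
    elif not NCNAME.match(prefix):
        raise ValueError(f"prefix is not an NCName: {value!r}")
    if not NCNAME.match(local):
        raise ValueError(f"local name is not an NCName: {value!r}")
    if prefix and prefix not in prefixes:
        raise ValueError(f"undeclared prefix: {prefix!r}")
    return prefixes.get(prefix), local


# Hypothetical prefix binding for a custom value set.
in_scope = {"ex": "http://example.org/codes"}
resolved = resolve_qname("ex:draft", in_scope)
```

Values containing whitespace, a leading digit in the local name, a second
colon, or an undeclared prefix are rejected, which is why a QName is far from
"almost any arbitrary string".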

We should stand back and ask ourselves the important architectural question:
Do we intend to permit custom extension of enumerative value sets?  If we
don't, we should require that only the values we define be syntactically
acceptable for a given set.  (I think we should also obligate ourselves to
use NCName as the XML Schema base type to keep ourselves out of trouble.)  

If we do intend to permit custom extensions, it behooves us to ensure that
we have, from the beginning, a provision that allows unambiguous, unique
introduction of custom values using decentralized authority.  We are in a
position to require that these be implementation-defined for conformant use
(i.e., there must be public documentation of the namespace used and the
values that are introduced for the particular attribute case).
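
As a rough illustration of what "implementation-defined" could buy us, the
Python sketch below separates a built-in value, a publicly documented custom
value, and a probable mistake.  Everything in it is assumed for the example:
the built-in names are placeholders rather than actual TAML values, and the
DOCUMENTED table stands in for whatever public documentation a namespace
owner provides.  XML Schema validation alone would not perform this check.

```python
# Placeholder built-in values (hypothetical, not taken from any TAML draft).
BUILT_IN = {"alpha", "beta"}

# Hypothetical public documentation: namespace URI -> documented local names.
DOCUMENTED = {
    "http://example.org/codes": {"draft", "retired"},
}


def classify(value, prefixes):
    """Return 'built-in', 'custom', or 'unknown' for an attribute value.

    `prefixes` maps in-scope prefixes to namespace URIs.
    """
    if ":" not in value:
        # Unprefixed values must come from the predefined set.
        return "built-in" if value in BUILT_IN else "unknown"
    prefix, local = value.split(":", 1)
    namespace = prefixes.get(prefix)
    if namespace is not None and local in DOCUMENTED.get(namespace, set()):
        return "custom"
    # Undeclared prefix, undocumented namespace, or undocumented local name.
    return "unknown"


in_scope = {"ex": "http://example.org/codes"}
results = [classify(v, in_scope) for v in ("alpha", "ex:draft", "ex:drfat")]
```

The misspelled "ex:drfat" is still a syntactically valid QName; it is only
the documented value list for the namespace that exposes it as a mistake.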

The clean choices seem to be either no permissible extensions (because there
is no safe mechanism provided and we want to retain the possibility of
introducing one later) or handling the extension mechanism now.  I'm for
now, so that the community can evolve additional applications and
enumerative values without having to wait for revisions to a TAG
specification.  It also provides benign ways to add to the standardized set
in the future.  This is the best opportunity we will ever have for putting a
stake in the ground here.  

I'd be satisfied with either choice, although my preference is to address
extensibility in TAML 1.0.  I can't imagine that extensions won't be made
and/or asked for, especially if there is diverse adoption of the TA Model.

I'm fairly confident that we can't use TAG for OIC work without the prospect
of custom extensions for the peculiar demands around testing
document-processing applications for how they honor a standardized format in
interoperable ways.  So we'd have to use the model but not the TAML.  (We
also use Relax NG and other schema models including, heaven forefend, OWL.)

I also envision applications of my own, around implementation
specifications, that would benefit from the TAG model.  It would be nice to
defer everything to the TA Model and TAML, but I am not constrained to that.

 - Dennis

PS: I got a 502 the first time I accessed the developerWorks URL, but I got
through later.  I don't think there is a contradiction here, although I think
Kiel means to be addressing this case in the context of developing or using
a standard-defined schema.  

Note that the union case fails if more than one organization independently
introduces and uses the "x:" prefix and schema-union technique.  The use of
namespaces for disambiguation (the whole point in XML) is an
already-recognized practice.  QName is simply a generalized version of
Kiel's example to deal with decentralized customization of enumerations with
namespaces as the disambiguating authority.  
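
The collision can be shown with a small Python sketch.  The two organizations,
their code meanings, and the example namespace URIs are invented for the
illustration; the point is only that a literal "x:" string carries no
authority, while a namespace-qualified name does.

```python
# Two organizations independently define a custom value spelled "x:urgent".
org_a = {"x:urgent": "escalate within one hour"}
org_b = {"x:urgent": "flagged as important by the author"}

# Treating the prefix as a literal string: merging the two code lists
# silently keeps only one meaning.
merged_literal = {**org_a, **org_b}


def expand(value, prefixes):
    """Expand 'prefix:local' to (namespace_uri, local) using in-scope prefixes."""
    prefix, _, local = value.partition(":")
    return prefixes[prefix], local


# Treating the prefix as a namespace binding: the same lexical value expands
# to two distinct names, because each document binds "x" to its own namespace.
name_a = expand("x:urgent", {"x": "http://a.example/codes"})
name_b = expand("x:urgent", {"x": "http://b.example/codes"})

merged_qualified = {
    name_a: org_a["x:urgent"],
    name_b: org_b["x:urgent"],
}
```

After the merge, merged_literal holds a single entry while merged_qualified
holds both, which is the disambiguation that namespaces are there to provide.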

If we have the requirement that XML Schema validation must be sufficient,
then I think we should not allow anything but restriction to our predefined
terms.  In that case, I would use NCName as the base type.  Of course, QName
is a built-in XML Schema datatype too.

-----Original Message-----
From: stephengreenubl@gmail.com [mailto:stephengreenubl@gmail.com] On Behalf
Of Stephen Green
http://lists.oasis-open.org/archives/tag/200911/msg00029.html
Sent: Friday, November 13, 2009 12:40
To: dennis.hamilton@acm.org
Cc: TAG TC List
Subject: Re: [tag] Proposal: Providing Decentralized Extensiblity of
Enumerative-Attribute Values

There are many principles relevant here. Having spent many
years closely monitoring the UBL TC and Codelist Representation
TC deliberations on this and discussing the same within UBL TC
I have found that the kinds of conclusions in papers such as this
one
http://www.ibm.com/developerworks/xml/library/x-extenum/index.html
have much merit. The problem is one of how to apply the architecture
to which we have already subscribed by electing to use XML Schema.
The schema has to be able to discern certain things, especially it
really MUST be able to validate the existing, built-in codes. Do we
really expect to be extending these codes? Even if we do, we need to
ensure the schema knows the difference between a custom code and
a mistake. QName alone would allow virtually any string to be valid.
I'm not sure that price is one worth paying. Plus QName would perhaps
restrict the custom code values: That might not be such an issue unless
such codes are outside of the control of the customizer, as with an
externally defined codelist whose code values might not all be valid as
QNames.
[ ... ]
