tag message

Subject: Re: [tag] Proposal: Providing Decentralized Extensiblity of Enumerative-Attribute Values
From: Stephen Green <stephen.green@documentengineeringservices.com>
To: dennis.hamilton@acm.org
Date: Fri, 13 Nov 2009 21:18:58 +0000
>>> PS: There is an architectural principle that is vaguely applicable here.
>>> "It is always easier to relax a restriction in the future than there is to
>>> impose a restriction that did not apply previously."  That has a lot of
>>> weight with me.

I have always taken the opposite view, and I'm open to correction here:
if you use normalizedString and later decide to restrict it to a qualified
name, isn't that an acceptable change that does not require a new namespace?
I have always tended to think so. Am I wrong in that?

The principle with schemas seems to be that a tightened restriction is
fully in keeping with the previous version. Doesn't this comply with the
architectural principle we all follow, that later profiles may add
restrictions not placed previously but may not loosen restrictions enforced
by the previous version of the parent standard? Otherwise the later version
would break the rules of the previous one. Increasing the restrictions does
not break the rules applied in a previous version; it merely adds to those
rules. Existing instances may then be invalid by the later schema, but
instances valid by the later schema will also be valid by the previous
schema, so it can be called a minor version change. It might not necessarily
require a change of namespace (though in the case in point a change
of namespace is arguably needed because a datatype has changed, unless
the restriction could be applied outside of the schema, but that is
debatable, I think).
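
A minimal sketch of such a tightening, under stated assumptions: the type
names identifier_v1_type and identifier_v2_type are hypothetical and do not
come from any TAML draft, and the pattern only approximates the lexical shape
of a qualified name (it does not check that a prefix is bound to a namespace).
Because the later type is derived from the earlier one by restriction, any
value valid against the later type is also valid against the earlier one.
Switching the base to xs:QName itself would be a different matter, since
xs:QName is a separate primitive type and not a restriction of
xs:normalizedString.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Earlier version (hypothetical name): any single-line string is allowed -->
  <xs:simpleType name="identifier_v1_type">
    <xs:restriction base="xs:normalizedString"/>
  </xs:simpleType>

  <!-- Later version (hypothetical name): derived by restriction from the
       earlier type, so it can only narrow the set of valid values -->
  <xs:simpleType name="identifier_v2_type">
    <xs:restriction base="identifier_v1_type">
      <!-- Lexical shape of a qualified name: NCName, optionally
           followed by ":" and a second NCName -->
      <xs:pattern value="[\i-[:]][\c-[:]]*(:[\i-[:]][\c-[:]]*)?"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
```

With these definitions a value such as "taml:deprecated" is valid against
both types, while "not a qname" is valid only against the earlier one.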

I still would not expect, in a future minor version, to have to change the
datatype to a QName, any more than I would expect to change any other
datatypes to more restricted ones for non-enumerated types. Not everyone
using TAML would want to be forced, say, to use a URI as a certain
identifier. Sometimes an XPath might be more appropriate, and if the
datatype is changed from normalizedString to anyURI it breaks their
implementations. Even making this restriction up front in the original version
prevents them from using XPaths to address an element in an XML document:
they have to use a URI valid according to the xsd:anyURI datatype
(which I'm assuming doesn't accept all XPaths as valid content values).
This is as bad as making the same restriction later. They can't use
TAML and profile it for XPath whether that happens sooner or later. I
don't think we should exclude such profiles unless the reasons are
truly essential, so we should keep, wherever possible, to less restricted
datatypes. Those words might come back and bite me later though :-)
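
To illustrate how a profile might narrow a deliberately loose base type
without changing the parent schema, here is a rough sketch using xs:redefine.
Everything named in it is an assumption for illustration only: the file name
taml-base.xsd, the type name target_type (assumed to be defined in that file
as a restriction of xs:normalizedString), and the pattern itself. A profile
that preferred XPath expressions would substitute its own pattern, or check
the values outside the schema.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Assumes taml-base.xsd (hypothetical) has no target namespace,
       or the same target namespace as this profile schema -->
  <xs:redefine schemaLocation="taml-base.xsd">

    <!-- Inside xs:redefine a simple type must restrict itself,
         so the profile can tighten the type but never loosen it -->
    <xs:simpleType name="target_type">
      <xs:restriction base="target_type">
        <!-- This profile accepts only absolute http(s) URIs -->
        <xs:pattern value="https?://\S+"/>
      </xs:restriction>
    </xs:simpleType>

  </xs:redefine>

</xs:schema>
```

Had the parent schema already fixed the type as xs:anyURI, a profile could
only narrow it further from there; it could not widen it again.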


Best regards
---
Stephen D Green




2009/11/13 Stephen Green <stephen.green@documentengineeringservices.com>:
> Of course, we could opt for the Codelist Representation
> TC methodology of providing all of our codes
> in separate artefacts (such as using 'genericode'),
> though for markup such as ours it has
> some major downsides, especially that authors
> using XML editors do not get intellisense offering
> them the codes as they write (leaving the door
> open to harmful typos), and I guess our markup is
> one which will sometimes be typed by hand.
> I'd rather not resort to this if we can help it, for the
> reasons described above. Even if we did, we would
> still have to consider the best datatype to use for
> the base type. I find normalizedString is best in that
> it allows all possible codes we and others are likely
> to need. The mechanism for extensibility does not
> require more than this, if we define the code values
> externally, because we do not need to use an XSD
> union. We simply treat the enumerated datatype
> the same as any other, and normalizedString is, IMO,
> optimal where you merely wish to prevent multiline
> content.
>
> Best regards
> ---
> Stephen D Green
>
>
>
>
> 2009/11/13 Stephen Green <stephen.green@documentengineeringservices.com>:
>> There are many principles relevant here. Having spent many
>> years closely monitoring the UBL TC and Codelist Representation
>> TC deliberations on this and discussing the same within UBL TC
>> I have found that the kinds of conclusions in papers such as this
>> one
>> http://www.ibm.com/developerworks/xml/library/x-extenum/index.html
>> have much merit. The problem is one of how to apply the architecture
>> to which we have already subscribed by electing to use XML Schema.
>> The schema has to be able to discern certain things, especially it
>> really MUST be able to validate the existing, built-in codes. Do we
>> really expect to be extending these codes? Even if we do, we need to
>> ensure the schema knows the difference between a custom code and
>> a mistake. QName alone would allow virtually any string to be valid.
>> I'm not sure that price is one worth paying. Plus QName would perhaps
>> restrict the custom code values. That might not be such an issue unless
>> such codes are outside of the control of the customizer, as with an
>> externally defined codelist whose code values might not all be valid as
>> QNames.
>>
>> I'm also not entirely convinced about adoption by one standard of
>> any other particular standard unnecessarily; that would be akin to
>> breaking the design principle of loose coupling. If the one standard
>> changes or becomes obsolete the other standard is jeopardised. Plus
>> another standard may become prevalent within the lifetime
>> of the markup which is thereby precluded. As long as we don't preclude
>> the said standards I think we should keep away from actually adopting
>> any standard unless it is a core requirement of the markup, and I don't
>> think that is so here. Even with RDFa and W3C there seems to be an
>> effort with XHTML/RDFa to avoid any change to XHTML beyond a
>> profile. The profile itself even falls short, apparently, of tight coupling
>> with CURIE: it allows CURIE but does not, I think, require it. That would
>> be as far as I would want to go, but I would not want to make CURIE
>> a normative reference. It barely seems core enough to test assertions.
>> If there were a place for that it might be in an RDF representation of
>> the model, if there. The need to provide a mechanism for extending
>> code values, though we hardly have a strong need to do so, can, I think,
>> be met without resorting to normative references. The more normative
>> references we have the less easy it is to adopt/implement the markup.
>>
>> Best regards
>> ---
>> Stephen D Green
>>
>>
>>
>>
>> 2009/11/13 Dennis E. Hamilton <dennis.hamilton@acm.org>:
>>> Thinking about this some more, I think there are a great many reasons to use
>>> a restricted syntax for enumerated-value identifiers.
>>>
>>> Secondly, I think it is valuable to use a syntax that is already defined and
>>> also has appropriate semantics.
>>>
>>> My preference is to use the QualifiedName syntax already nicely defined in
>>> the [xml-names] specification.  In addition, I would add the proviso that
>>> when a PrefixedName is present, it be a valid CURIE (eliminating the need for
>>> brackets altogether).  This makes life very easy.  You should not need to
>>> borrow any special syntax in the schema, although QualifiedName does have a
>>> BNF definition.
>>>
>>> Being wide-open with normalized attribute values raises all sorts of
>>> problems with regard to internationalization, preservation in UIs, etc.
>>> (NCName has issues too, but it is a well-delimited case.)
>>>
>>> Being wide-open also raises more complicated cases of deciding when a
>>> special case is present rather than being merely a coincidental use of the
>>> general case.  I think your use of brackets was intended to forestall that,
>>> although I haven't checked it carefully.  I'm inclined to be more
>>> restrictive and much simpler.
>>>
>>>  - Dennis
>>>
>>> PS: There is an architectural principle that is vaguely applicable here.
>>> "It is always easier to relax a restriction in the future than there is to
>>> impose a restriction that did not apply previously."  That has a lot of
>>> weight with me.
>>>
>>> -----Original Message-----
>>> From: stephengreenubl@gmail.com [mailto:stephengreenubl@gmail.com] On Behalf
>>> Of Stephen Green
>>> Sent: Friday, November 13, 2009 09:46
>>> To: dennis.hamilton@acm.org
>>> Cc: TAG TC List
>>> Subject: Re: [tag] Proposal: Providing Decentralized Extensiblity of
>>> Enumerative-Attribute Values
>>>
>>> CORRECTION
>>>
>>> sorry, by
>>> <quote>
>>> but so would this
>>>
>>> level="deprecated"
>>> </quote>
>>>
>>> I meant
>>>
>>> <quote>
>>> but so would this
>>>
>>> level="[deprecated]"
>>> </quote>
>>>
>>> Best regards
>>> ---
>>> Stephen D Green
>>>
>>>
>>>
>>>
>>> 2009/11/13 Stephen Green <stephen.green@documentengineeringservices.com>:
>>>> Re: 3.1 (1) / 3.2 (1)
>>>>
>>>> I'm not sure, though I'd need to investigate this some
>>>> more, that restricting the base datatype of enumerations
>>>> beyond normalizedString has a lot of benefit. If the
>>>> datatype is normalizedString then I would have thought
>>>> it can still be used for URIs and QNames.
>>>>
>>>> I propose this as a simple type for custom enumerations
>>>>
>>>>        <xs:simpleType name="codeExtension_type">
>>>>                <xs:restriction base="xs:normalizedString">
>>>>                        <xs:pattern value="\[\S.*\]"/>
>>>>                </xs:restriction>
>>>>        </xs:simpleType>
>>>>
>>>> xs:normalizedString would permit both URIs and QNames.
>>>> The pattern \[\S.*\] would require that a code be surrounded
>>>> by square brackets which should then be easily removed
>>>> to provide a normalized string such as a URI and I hope
>>>> this would be not just consistent with RDFa but with pretty
>>>> much any other enumeration scheme such as a simple
>>>> codelist. The surrounding square brackets just ensure the
>>>> string is not one of the built-in TAML codes misspelt, etc.
>>>>
>>>> This would be allowed, for example:
>>>>
>>>> level="[http://docs.oasis-open.org/tag/taml/20100920/deprecated]"
>>>>
>>>> but so would this
>>>>
>>>> level="deprecated"
>>>>
>>>> and for RDFa implementations, etc the "anyAttribute" allows this
>>>>
>>>>
>>>> level="[deprecated]"
>>>> resource="http://docs.oasis-open.org/tag/taml/20100920/extendedcodelist.xsd"
>>>>
>>>> or presumably whatever other syntax the RDFa requires.
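>>>>
>>>> A sketch of one way the bracketed pattern could sit alongside the
>>>> built-in codes, assuming an XSD union is used (the thread does not
>>>> settle this). The names levelCode_type and level_type are hypothetical,
>>>> and the two enumeration values are placeholders, not the actual TAML
>>>> code list. Only codeExtension_type is taken from the proposal above.
>>>>
>>>> ```xml
>>>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
>>>>
>>>>   <!-- Built-in codes (placeholder values for illustration) -->
>>>>   <xs:simpleType name="levelCode_type">
>>>>     <xs:restriction base="xs:normalizedString">
>>>>       <xs:enumeration value="mandatory"/>
>>>>       <xs:enumeration value="deprecated"/>
>>>>     </xs:restriction>
>>>>   </xs:simpleType>
>>>>
>>>>   <!-- Custom codes must be wrapped in square brackets -->
>>>>   <xs:simpleType name="codeExtension_type">
>>>>     <xs:restriction base="xs:normalizedString">
>>>>       <xs:pattern value="\[\S.*\]"/>
>>>>     </xs:restriction>
>>>>   </xs:simpleType>
>>>>
>>>>   <!-- Valid if the value is a built-in code or a bracketed custom code -->
>>>>   <xs:simpleType name="level_type">
>>>>     <xs:union memberTypes="levelCode_type codeExtension_type"/>
>>>>   </xs:simpleType>
>>>>
>>>> </xs:schema>
>>>> ```
>>>>
>>>> Under these assumptions a misspelt built-in code such as "deprecatd"
>>>> matches neither member type and is rejected, while "[deprecatd]" is
>>>> accepted as a custom code.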
>>>>
>>>> I propose to put this pattern in the next draft schema (if nobody
>>>> objects - we could still change it later of course).
>>>>
>>>> Thanks for good input Dennis. I hope I haven't missed your point.
>>>>
>>>>
>>>> Best regards
>>>> ---
>>>> Stephen D Green
>>>>
>>>>
>>>>
>>>>
>>>> 2009/11/10 Dennis E. Hamilton <dennis.hamilton@acm.org>:
>>>>> I had mentioned this some time ago, and then failed to fulfill the action
>>>>> item about it.
>>>>>
>>>>> 1. ENVIRONMENT
>>>>>
>>>>> 1.1 This applies to Test Assertion Markup Language as a particular
>>>>> framework under the model.
>>>>>
>>>>> 1.2 This might also be tied to provisions for unambiguous extension of
>>>>> terms for certain enumerations in the TA Model as well.
>>>>>
>>>>> 2. OBJECTIVE
>>>>>
>>>>> 2.1 A number of fixed terms are specified for certain attribute-value
>>>>> choices in the TA Markup Language.
>>>>>
>>>>> 2.2 It is desirable to allow use of custom values for those attributes as
>>>>> well, but in a way where the custom values are unambiguous and can be chosen
>>>>> without concern for confusion with the values fixed in the schema or with
>>>>> values introduced by others.
>>>>>
>>>>> 3. PROPOSAL
>>>>>
>>>>> 3.1 Where there are specific fixed-choices for an attribute where custom
>>>>> choices are permitted,
>>>>>
>>>>>  (1) restrict the attribute schema to values of type anyURI
>>>>>
>>>>>  (2) specify that the syntax of the fixed choices will always conform to
>>>>>      NCName syntax.
>>>>>
>>>>>  (3) require that custom values for the attribute will always be absolute
>>>>>      URIs.
>>>>>
>>>>>  (4) where convenient, one might also allow a CURIE syntax using
>>>>>      PrefixedName syntax as a shorthand for the absolute URIs.
>>>>>
>>>>>  (5) the fixed values should also have matching absolute URIs using a
>>>>>      namespace that is defined by and for the TA Markup Language.
>>>>>
>>>>> 3.2 Simplified alternative (Recommended for its harmony with use of TAG
>>>>> terms in RDF):
>>>>>
>>>>>   (1) restrict the attribute schema to values having the syntax of
>>>>>       QualifiedName (XML Namespace syntax)
>>>>>
>>>>>   (2) specify that the syntax of the fixed choices will always conform to
>>>>>       NCName syntax.
>>>>>
>>>>>   (3) require that custom values for the attribute will always be
>>>>>       Compact URIs (CURIEs).
>>>>>
>>>>>   (4) Define a namespace by which the fixed names can be used in CURIEs
>>>>>       as well as without namespace prefixes (corresponding to (5) above).
>>>>>
>>>>> 4. PRECEDENT
>>>>>
>>>>> These kinds of arrangements are being used to allow for customization in
>>>>> some XML office-document markup formats and as a way of adding extensibility
>>>>> to certain attribute-value choices in formats such as HTML and XHTML.
>>>>>
>>>>> [RDFa-XHTML] RDFa in XHTML: Syntax and Processing -- A collection of
>>>>> attributes and processing rules for extending XHTML to support RDF.  W3C
>>>>> Recommendation 14 October 2008.  Available at
>>>>> <http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/>.  See section 2.1 for
>>>>> the general idea and sections 3.8 and 5.4 for CURIE and URI processing.
>>>>> (The peculiar structure of RDF in examples doesn't matter, this is just
>>>>> about how attribute values are treated as namespaced terms.)
>>>>>
>>>>>
>>>>>
>>>>>  - Dennis
>>>>>
>>>>> Dennis E. Hamilton
>>>>> ------------------
>>>>> NuovoDoc: Design for Document System Interoperability
>>>>> mailto:Dennis.Hamilton@acm.org | gsm:+1-206.779.9430
>>>>> http://NuovoDoc.com http://ODMA.info/dev/ http://nfoWorks.org
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe from this mail list, you must leave the OASIS TC that
>>>>> generates this mail.  Follow this link to all your TCs in OASIS at:
>>>>> https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>
Follow-Ups:
- Namespace versioning and backward-forward-compatibility issues
  - From: "Dennis E. Hamilton" <dennis.hamilton@acm.org>
References:
- Proposal: Providing Decentralized Extensiblity of Enumerative-Attribute Values
  - From: "Dennis E. Hamilton" <dennis.hamilton@acm.org>
- Re: [tag] Proposal: Providing Decentralized Extensiblity of Enumerative-Attribute Values
  - From: Stephen Green <stephen.green@documentengineeringservices.com>
- Re: [tag] Proposal: Providing Decentralized Extensiblity of Enumerative-Attribute Values
  - From: Stephen Green <stephen.green@documentengineeringservices.com>
- RE: [tag] Proposal: Providing Decentralized Extensiblity of Enumerative-Attribute Values
  - From: "Dennis E. Hamilton" <dennis.hamilton@acm.org>
- Re: [tag] Proposal: Providing Decentralized Extensiblity of Enumerative-Attribute Values
  - From: Stephen Green <stephen.green@documentengineeringservices.com>
- Re: [tag] Proposal: Providing Decentralized Extensiblity of Enumerative-Attribute Values
  - From: Stephen Green <stephen.green@documentengineeringservices.com>