Subject: ODF 1.2 RDF Metadata, xml:id, and ID deprecation - PROBLEM

I recommend that this analysis be double-checked carefully.

I believe there is a serious defect in the way xml:id is introduced in ODF 1.2 and in how other uses of attributes having ID type are being deprecated.

Because this is relevant to the way the ODF 1.2 Metadata proposal is introduced (and it is partly what led to this research), I believe this topic needs to be given urgent attention.

1. The problem is with statements of this kind (here from 18.7 on anim:id):

      The anim:id attribute assigns an ID to an animation element.
      The anim:id attribute is deprecated in favor of xml:id. See 18.1278.
      Applications that read documents shall ignore this attribute if 
      there is an xml:id attribute existing for the same element. 
      If no xml:id attribute is existing for the same element, then the 
      anim:id attribute should be processed as it were an xml:id attribute. 
      Applications that write documents may still write anim:id attributes 
      in addition to xml:id attributes. An element shall not have an anim:id
      attribute if there is no xml:id attribute existing for the same element. 
      The value of the anim:id attribute shall equal the value of the xml:id 
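
As a reading aid, the reader-side rule in the quoted text can be sketched as follows. This is only an illustration of the draft wording, not a recommended design: the helper name effective_id and the sample fragment are hypothetical, and the namespace URI is the ODF animation namespace as I understand it.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
ANIM_NS = "urn:oasis:names:tc:opendocument:xmlns:animation:1.0"
ANIM_ID = "{%s}id" % ANIM_NS

def effective_id(elem):
    """Return the identifier a reader would use under the quoted draft rule."""
    xml_id = elem.get(XML_ID)
    if xml_id is not None:
        return xml_id            # anim:id, if present, is ignored
    return elem.get(ANIM_ID)     # fall back to the deprecated attribute (may be None)

# Hypothetical fragment: one element with both attributes, one with anim:id only.
doc = ET.fromstring(
    '<root xmlns:anim="%s">'
    '<anim:par xml:id="a1" anim:id="a1"/>'
    '<anim:par anim:id="a2"/>'
    '</root>' % ANIM_NS
)
ids = [effective_id(e) for e in doc]
```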

There are several difficulties:  

   1.1 The proper rules for identifying XML elements as targets of fragment references (including via IDREF and IDREFS) have to do with the type of the attribute, not its name, whether or not the name is xml:id (or has "id" in it, although there is a widespread practice of using "id" as the local name of ID-valued attributes).

   1.2 [XML 1.0] forbids (as valid XML) elements having more than one attribute of type ID and any two attributes of type ID having the same value in the same XML document (accessing fragments in a different XML document being a separate matter usually handled with a URI).

   1.3 The [XML Names] specification preserves that rule but further constrains attributes of type ID to have values that are lexical NCNames rather than Names (that is, no ":" characters in the values for attributes of type ID).

   1.4 The [xml:id 1.0] specification very carefully preserves the [XML 1.0] requirement.  There is a place that can be read as relaxing the [XML 1.0] rule, but that is not about the validity of the document but about a requirement on processors.

   1.5 The only language I can find that proposes two attributes taking the same value, one of them being of type ID, applies where the other attribute is *not* of type ID.  [This, by the way, is a better way to deprecate those ID-valued attributes from ODF 1.1 so that they don't clash with use of xml:id -- change them to have values of type NCName rather than ID in a way that ODF 1.1 usage remains upward compatible but there is no collision with an xml:id on the same element.]  An example of such language is in [XHTML 1.0] section C8:

      In XML, URI-references [RFC2396] that end with fragment identifiers
      of the form "#foo" do not refer to elements with an attribute 
      name="foo"; rather, they refer to elements with an attribute defined
      to be of type ID, e.g., the id attribute in HTML 4. Many existing 
      HTML clients don't support the use of ID-type attributes in this 
      way, so identical values may be supplied for both of these 
      attributes to ensure maximum forward and backward compatibility 
     (e.g., <a id="foo" name="foo">...</a>).

The suggested use of identical values depends on the fact that the name attribute is *not* of type ID and the id attribute is.  (Also, notice that the discussion is about HTML clients, not XML processors.)

For reference, I have added the specific sources below.  I believe this situation is easily cured, but we need to align on the cure so that the ODF 1.2 draft and affected proposals can be adjusted and reconciled.
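
To make the first half of point 1.2 concrete, here is a minimal sketch that flags elements carrying more than one attribute of type ID. ElementTree does not read attribute types from a schema, so the set ID_TYPED is an assumption standing in for the schema's declarations; the function name and sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
ANIM_NS = "urn:oasis:names:tc:opendocument:xmlns:animation:1.0"
# Assumption: the attributes the schema declares with type ID.
ID_TYPED = {XML_ID, "{%s}id" % ANIM_NS}

def multiple_id_elements(root):
    """List (tag, attribute names) for elements with more than one ID-typed attribute."""
    clashes = []
    for elem in root.iter():
        present = sorted(name for name in elem.attrib if name in ID_TYPED)
        if len(present) > 1:
            clashes.append((elem.tag, present))
    return clashes

# Hypothetical fragment written the way the draft permits.
doc = ET.fromstring(
    '<root xmlns:anim="%s">'
    '<anim:par xml:id="a1" anim:id="a1"/>'
    '<anim:seq xml:id="a2"/>'
    '</root>' % ANIM_NS
)
clashes = multiple_id_elements(doc)
```

Under that assumption, only the first element is reported, because it alone bears two ID-typed attributes.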

 - Dennis

A. [XML 1.0] RULES

A.1. My source is

[XML 1.0] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, François Yergeau (eds.).  Extensible Markup Language (XML) 1.0 (Fifth Edition), W3C Recommendation 26 November 2008, available at <http://www.w3.org/TR/2008/REC-xml-20081126/>.  (This change has a material impact on what we can expect to find in an NCName, by the way.)

A.2 [Validity constraint: ID] Values of type ID Must match the Name production.  A name MUST NOT appear more than once in an XML document as a value of this type; i.e., ID values MUST uniquely identify the elements which bear them.

A.3 [Validity constraint: ID Attribute Default] An ID attribute must have a declared default of #IMPLIED or #REQUIRED.  {dh: For our purposes #IMPLIED just means no default value is specified and that the attribute is not required.}
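
The uniqueness constraint in A.2 can be sketched as a simple count of values across every ID-typed attribute in a document. As before, the caller supplies the ID-typed attribute names (an assumption, since no schema is consulted here), and duplicate_ids is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
ANIM_NS = "urn:oasis:names:tc:opendocument:xmlns:animation:1.0"
ANIM_ID = "{%s}id" % ANIM_NS

def duplicate_ids(root, id_attrs):
    """Return ID values that appear more than once among the ID-typed attributes."""
    counts = Counter(
        elem.attrib[name]
        for elem in root.iter()
        for name in elem.attrib
        if name in id_attrs
    )
    return sorted(value for value, n in counts.items() if n > 1)

# Hypothetical fragment: the same value on anim:id and xml:id of one element.
doc = ET.fromstring(
    '<root xmlns:anim="%s">'
    '<anim:par xml:id="a1" anim:id="a1"/>'
    '<anim:seq xml:id="a2"/>'
    '</root>' % ANIM_NS
)
dups = duplicate_ids(doc, {XML_ID, ANIM_ID})
```

If both attributes are of type ID, the value "a1" is counted twice even though it sits on a single element, which is the collision described in 1.2.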

B. [XML Names] RULES

B.1 My source is

[XML Names]  Tim Bray, Dave Hollander, Andrew Layman, Richard Tobin (Eds.).  Namespaces in XML 1.0 (Second Edition), W3C Recommendation 16 August 2006, available at <http://www.w3.org/TR/2006/REC-xml-names-20060816/>.

B.2 [XML Names] makes the additional provision that tokens required for [XML 1.0] well-formedness to conform to the production for Name must conform to the [XML Names] production for NCName instead.  It is explicitly pointed out that this constraint applies to attributes with a declared type of ID, IDREF, IDREFS, ENTITY, ENTITIES, or NOTATION, meaning that their values cannot contain any colons.
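
A rough lexical check for the NCName restriction in B.2 might look like the following. It is deliberately an ASCII-only approximation: the real NCName production admits many non-ASCII name characters, so this sketch rejects some legal values. The names NCNAME_ASCII and is_ncname_ascii are hypothetical.

```python
import re

# ASCII-only approximation of NCName: a letter or underscore, followed by
# letters, digits, '.', '-' or '_'. No colon is allowed anywhere.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

def is_ncname_ascii(value):
    """True if value matches the ASCII subset of the NCName production."""
    return NCNAME_ASCII.fullmatch(value) is not None

checks = [is_ncname_ascii(v) for v in ("a1", "pre:a1", "1a", "_x-1.b")]
```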

B.3 [XML Names] also defines conformance requirements for namespace-valid and namespace-well-formed documents and I wonder if there is an appropriate profiling for that in the ODF specification (and the special/curious cases of name-space use in RDF and CURIEs).

C. [xml:id 1.0] RULES

C.1 My source is 

[xml:id 1.0] Jonathan Marsh, Daniel Veillard, Norman Walsh (eds).  xml:id 1.0, W3C Recommendation 9 September 2005, available at <http://www.w3.org/TR/2005/REC-xml-id-20050909/>.

C.2 The [xml:id 1.0] Introduction affirms the [XML 1.0] (and [XML 1.1]) conditions on elements with unique identifiers of type ID:

  * the ID value matches the allowed lexical form,

  * the value is unique within the XML document, and that

  * each element has at most one single unique identifier 

It is pointed out that using xml:id for this has the advantage that it is not necessary to consult the schema to know what attribute might be of type ID and the process of ID type assignment (basically, figuring out where the different ID values occur in the document) becomes straightforward.  This also makes it easy to add an xml:id where there is none (provided there is no other attribute of type ID on the particular element).  It also allows xml:id to be used without any declaration of the xml namespace or of the attribute itself.

C.3 The [xml:id 1.0] Terminology section has the following interesting note:

      Note: Application-level processing of IDs, including which elements
      can actually be addressed by which ID values, is beyond the scope of
      the specification.

I take this to apply to the determination of whether the reference to an element by its identification and the element are compatible under the data model of the application, something that the description of an ID type assignment process does not address.  It is not clear that it says anything about what the ID value should be, and maybe not even whether an element can have an xml:id or not.

C.4 The very beginning of the [xml:id 1.0] Syntax section has, as its second sentence, 

      Authors of XML documents are encouraged to name their ID attributes
      "xml:id" to increase the interoperability of these identifiers on 
      the Web.

I take this as reinforcement that this is all about ID, not just xml:id being the desired way to establish one in an element.

C.5 The [xml:id 1.0] Processing xml:id Attributes section provides the following requirements on an xml:id processor (restated here for simplicity):

   * the normalized value of the attribute shall be an NCName in accordance with [XML Names]

   * the declared type of the attribute, if it has one, shall be "ID".

C.6 The [xml:id 1.0] Processing section also states that an xml:id processor SHOULD assure that the values of all attributes of type "ID" (which includes all xml:id attributes) within a document are unique.  This does not remove the constraint from the XML document, and its violation is an xml:id error; the statement is merely that an xml:id processor SHOULD (as opposed to SHALL or MAY) assure it.
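
The processing requirements restated above can be sketched as a small checker. This is an illustration under stated assumptions, not an xml:id processor: it looks only at xml:id attributes (a real processor would consider every ID-typed attribute), it uses an ASCII-only approximation of NCName, and the names normalize_id and xml_id_errors are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")  # ASCII-only approximation

def normalize_id(value):
    """ID-type normalization: trim spaces and collapse runs of spaces."""
    return " ".join(part for part in value.split(" ") if part)

def xml_id_errors(root):
    """Report xml:id values that are not NCNames or that repeat an earlier value."""
    errors, seen = [], set()
    for elem in root.iter():
        raw = elem.get(XML_ID)
        if raw is None:
            continue
        value = normalize_id(raw)
        if not NCNAME_ASCII.fullmatch(value):
            errors.append("not an NCName: %r" % value)
        elif value in seen:
            errors.append("duplicate xml:id: %r" % value)
        seen.add(value)
    return errors

# Hypothetical fragment with one padded value, one duplicate, one colon.
sample = ET.fromstring('<r><a xml:id=" x1 "/><b xml:id="x1"/><c xml:id="p:q"/></r>')
errors = xml_id_errors(sample)
```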


Here are some of the many element definitions where there is an impact in OpenDocument-v1.2-draft7-13.odt, with introduction of xml:id along with deprecation of other ID-valued attributes:

<text:h> *
<text:p> *
<text:changed-region> *
<draw:page> *
<draw:rect> *
<draw:line> *
<draw:polyline> *
<draw:polygon> *
<draw:regular-polygon> *
<draw:path> *
<draw:circle> *
<draw:ellipse> *
<draw:connector> *
<draw:caption> *
<draw:measure> *
<draw:control> *
<draw:page-thumbnail> *
<draw:g> *
<draw:glue-point> *
<draw:frame> *
<draw:text-box> * 
<dr3d:scene> *
<dr3d:cube> *
<dr3d:sphere> *
<dr3d:extrude> *
<dr3d:rotate> *
<draw:custom-shape> *
<form:text> *
<form:text-area> * <form:textarea> ? *
<form:password> *
<form:file> *
<form:formatted-text> *
<form:number> *
<form:date> *
<form:time> *
<form:fixed-text> *
<form:combobox> *
<form:listbox> *
<form:button> *
<form:image> *
<form:checkbox> * 
<form:radio> *
<form:frame> *
<form:image-frame> *
<form:hidden> *
<form:grid> *
<form:value-range> *
<form:generic-control> *
<office:annotation> *
<anim:par> *
<anim:seq> *
<anim:iterate> *
<anim:audio> *
<anim:command> *

 - Dennis

Dennis E. Hamilton
NuovoDoc: Design for Document System Interoperability 
mailto:Dennis.Hamilton@acm.org | gsm:+1-206.779.9430 
http://NuovoDoc.com http://ODMA.info/dev/ http://nfoWorks.org 
