Subject: The desirability of xml:id stability
In today's call, there was interesting discussion about producers preserving xml:id attributes on elements that are preserved from a document that is being consumed. This is in reference to the proposal of OFFICE-3788: <https://tools.oasis-open.org/issues/browse/OFFICE-3788>.

I believe that such preservation is a valuable feature for complex document cases, but that requiring it is not a good idea for a .x release of the ODF specification.

The ODF 1.0/1.1/1.2 line does not require any such preservation. There is also nothing to prevent an implementation from doing it. So there is room for implementations to determine whether it is important for their use cases. There might be guidance about that, but I don't believe there should be any requirement about it.

Absent implementation differentiation becoming a factor in interoperability, it is perhaps not a good idea to suddenly impose this requirement on implementations. It is not clear that the benefit is such that all implementations should be required to preserve xml:id attribute ID values so long as the element having the xml:id occurrence persists. As desirable as this might be from a purist position, it does damage to the many implementations that have never found a use case sufficient to implement this already-allowed capability.

For calibration and added perspective, here are three use cases for the preservation of xml:ids. All have problems. All three concern preserving xml:ids for the referential integrity of references from outside the document that refer to internal elements of a document (or of a derivative). Accommodating any of them in ODF 1.3 might be a bridge too far.

CASE 1: [X]HTML Production.

When a document is saved as HTML, the xml:ids are presumably turned into identified anchors. This is necessary simply to allow for internal cross-references by IDREF attribute values that target an xml:id ID value.
Changing those ID and IDREF values when an edited document replaces an existing HTML document will break any deep links into the updated HTML export from anywhere else on the World Wide Web. That may not be acceptable for some uses of ODF implementations as tools for maintaining and producing an HTML rendition. (The same problem arises for user-created bookmarks, the identifiers that are generated for them, and cross-references to them.)

CASE 2: RDF in the same package and elsewhere. (Not just the RDFa in content.xml itself)

ODF 1.2 permits RDF parts to be included in a document that refer into elements of the document structure. These RDF parts need a way to identify the elements being referenced, and fragment IDs in URIs of the RDF terms are the common means. Likewise, when the RDF is extracted from the document (e.g., via a GRDDL procedure) or is otherwise external to a document, that RDF can make use of the ODF Package and OWL Document OWL classes to continue to refer to specific elements internal to the ODF package.

To the extent that a revision of the document is logically the same work with respect to the nature of the RDF about it, not preserving fragment IDs becomes a problem.

(It is also a challenge to deal with the fact that ODF currently lacks a means for creating a location-independent entity identification of a document. Something is needed for cases where different instances are to be taken as logically the same document. This requires something that can work as a persistent URI or URN for a document that is relatively instance-independent and where the document is not necessarily found only at a unique URL location on the Web.)

Finally, it is not to be expected that all implementations will be in a position to adjust RDF within packages to align with changed xml:id ID values in order to preserve the referential integrity of such metadata.
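
To make the referential-integrity concern concrete, here is a minimal sketch of the check that a consumer of external references would effectively have to make after a document is revised. It is an illustration only: the markup is hypothetical and is not real ODF content, the function names are invented for this example, and the list of externally referenced fragment IDs is assumed to be known.

```python
import xml.etree.ElementTree as ET

# Attribute key ElementTree uses for xml:id (the xml prefix is predefined).
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def collect_xml_ids(xml_text):
    """Return the set of xml:id values found anywhere in the given XML text."""
    root = ET.fromstring(xml_text)
    return {el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None}


def dangling_fragments(revised_xml, referenced_fragments):
    """Return the referenced fragment IDs that no longer match any xml:id."""
    present = collect_xml_ids(revised_xml)
    return sorted(set(referenced_fragments) - present)


# Hypothetical content: just enough structure to illustrate the problem.
ORIGINAL = '<doc><p xml:id="id1">Intro</p><p xml:id="id2">Body</p></doc>'
# The same elements persist, but the producer regenerated one identifier on save.
REVISED = '<doc><p xml:id="id1">Intro</p><p xml:id="id9">Body, edited</p></doc>'

# Fragment IDs that external RDF or a deep link is assumed to reference.
EXTERNAL_REFS = ["id1", "id2"]

if __name__ == "__main__":
    print(dangling_fragments(ORIGINAL, EXTERNAL_REFS))  # nothing dangles
    print(dangling_fragments(REVISED, EXTERNAL_REFS))   # id2 no longer resolves
```

The sketch only detects breakage. Repairing it would require knowing which new xml:id replaced which old one, and that mapping is exactly what is lost when a producer regenerates identifiers.
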
Some implementations will simply not deal with such RDF, and they may, but need not, preserve that RDF within the package. (There are pros and cons about this. Having mystery material can be a problem for document safety/security and also for documents that are digitally signed when there is implementation-unknown material.)

ODF 1.2 doesn't constrain this and it is difficult to see what ODF 1.3 can do beyond adding some guidance. It is perhaps better for guidance to be worked out and demonstrated at OIC first. That's certainly the case for RDF that is not in the package at all.

CASE 3: ODF 1.2 CHANGE TRACKING

Depending on how references to portions of documents involving tracked changes happen, there can be a problem with the preservation of xml:id attributes. In ODF 1.0/1.1/1.2 the connection of change information with the places in the document where the change applies is accomplished by the xml:id ID value on a <text:changed-region> element.

It is also the case that element start tags with xml:id attributes can be swept up into <text:deletion> elements that carry removed material. Those xml:ids would need to be preserved, since the deletion can be rejected in a later edit. (This situation has remarkable consequences for RDF now referencing an element that is (partially) deleted.)

I don't know whether this is comprehended as an edge case for the MCT-based change-tracking for ODF 1.3.

AND EDGE CASES

There are many edge cases to all of this. There is the interaction with change-tracking (and whether that can synchronize with arbitrary RDF in the package), accessibility (also impacted by change tracking), and probably other provisions, including concerns about covert content and digital signatures.

It is also important to note that the xml:id attribute ID values in ODF 1.2 documents are generally not thought to be user-specifiable. Where there are user-specified names, these are in other attributes that are usually not used as attribute values of type ID and IDREF.
(Note that this xml:id case should actually be about all ODF 1.x attributes having values of type ID, since uniqueness must be preserved across all of them. The xml:id ones are the only ones automatically accessible via fragment values in URI references.)

 - Dennis

PS: Another cat picture: <http://www.flickr.com/photos/orcmid/1502722674/in/set-72157600230263578>.