Subject: change markup thoughts
At our last DITA TechComm SC telcon, I took the action to write up some background/thoughts about change markup (e.g., namespaced elements versus PIs) for the SC to use as input into an email to send to the DITA TC about the subject. This is my attempt at outlining some of the issues. Apologies if it comes out as somewhat of a "core dump".

There can be a variety of changes one might want to mark, and a variety of granularities at which one might want to mark them. A minimum would be to define a fairly coarse granularity (e.g., a DITA topic, a DocBook chapter, an ATA pageblock) and, any time any part of it changes, mark the pre-change copy as deleted and the post-change copy as added. While this sounds like (and is) quite the sledgehammer approach, it actually has been used in various applications, especially when the CMS being used only works on large chunks.

But the current plan, of course, is to go further. Considering finer granularities, an element can be "changed" in various ways, including:

1. being deleted
2. being added
3. having an attribute value changed or an attribute specification added or deleted
4. having a child (text, element, comment, PI) added, deleted, or moved (but still a child of the element in question, or else it would be a deletion from the point of view of the current element)

Other, more subtle kinds of changes that we may or may not wish to try to capture include:

- taking an element and splitting it into two
- taking two consecutive elements and joining them into one
- taking an element and, without touching its contents, changing the element name (e.g., changing an ol to a ul)
- taking some content and adding surrounding markup (e.g., highlighting a word and wrapping it with a phrase element)

All of these changes (in fact, just about any change) could be handled by marking a deletion of whatever was old and an addition of whatever is new, if we don't feel the need to capture the subtler distinctions.
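As a rough illustration of items 3 and 4 in the numbered list above, here is a minimal Python sketch that compares an old and a new version of one element and reports attribute and child-element changes. The function names and the sample markup are made up for illustration. The sketch ignores text, comment, and PI children, and it does not detect moves or any of the subtler changes (splits, joins, renames); anything it cannot match is simply reported as a deletion of the old plus an addition of the new.

```python
import difflib
import xml.etree.ElementTree as ET

# Hypothetical sample data: two versions of the same list element.
OLD = ET.fromstring('<ol id="steps" rev="1"><li>a</li><li>b</li></ol>')
NEW = ET.fromstring('<ol id="steps" rev="2" outputclass="x"><li>a</li></ol>')


def attribute_changes(old, new):
    """Return (name, kind, old_value, new_value) tuples, ordered by attribute name."""
    changes = []
    for name in sorted(set(old.attrib) | set(new.attrib)):
        before, after = old.get(name), new.get(name)
        if before is None:
            changes.append((name, "added", None, after))
        elif after is None:
            changes.append((name, "deleted", before, None))
        elif before != after:
            changes.append((name, "changed", before, after))
    return changes


def child_changes(old, new):
    """Return ("deleted" | "added", tag) pairs for child elements.

    Children are compared by their serialized form, so an edited child
    shows up as one deletion plus one addition.
    """
    old_children, new_children = list(old), list(new)
    old_keys = [ET.tostring(c, encoding="unicode") for c in old_children]
    new_keys = [ET.tostring(c, encoding="unicode") for c in new_children]
    matcher = difflib.SequenceMatcher(None, old_keys, new_keys, autojunk=False)

    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op in ("delete", "replace"):
            changes.extend(("deleted", c.tag) for c in old_children[i1:i2])
        if op in ("insert", "replace"):
            changes.extend(("added", c.tag) for c in new_children[j1:j2])
    return changes


if __name__ == "__main__":
    print(attribute_changes(OLD, NEW))
    print(child_changes(OLD, NEW))
```

Even this small sketch shows why the finer-grained cases cost more: attribute changes need their own reporting shape, and child changes need a sequence comparison, whereas the coarse approach only needs to know that something differs.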
An element can be marked as added or deleted using an attribute (e.g., rev or something else) on the element in question. But it gets harder to indicate changes in attributes using an attribute. And indicating content changes, of course, cannot be done solely with attributes.

To indicate changes to the child of an element, such as some text, one can use a special element (e.g., add or delete) that indicates that its contents were added or deleted. The problem with using a (non-empty) element to mark additions or deletions is that you have to define its content model to be ANY so that it can wrap almost anything. You then lose any context enforcement within that element, unless your tool has some special processing that allows it to treat the change markup elements as "invisible" with respect to context checking. That is not standard parsing behavior with most schema languages (e.g., a DTD).

To avoid this issue, one could use only empty elements for change markup, defining the semantics of certain change markup elements to be "start change" and "end change". On the other hand, once we've gotten to this point in our thinking, it might make sense to consider using processing instructions (PIs) for our change markup, as several tools already do.

There is a technical description of Arbortext's change markup at

http://www.arbortext.com/namespace/atict/change-tracking-markup-spec.html

This page is written in terms of namespaced elements, but in fact, Arbortext only uses the namespaced elements to mark up XML. When change tracking is done on an SGML document, PIs corresponding to the described namespaced elements are used instead.
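To make the wrapper-element versus PI trade-off described above concrete, here is a small Python sketch. The chg:add element, its namespace URI, and the chg-add-start / chg-add-end PI targets are all invented for illustration; they are not Arbortext's markup or a proposal for DITA. The sketch assumes Python 3.8 or later (for the insert_pis option) and stands in for real validation with a trivial check that every child element of the list is an li.

```python
import xml.etree.ElementTree as ET

# Hypothetical change markup: a wrapper element versus a pair of marker PIs.
WRAPPED = (
    '<ul><li>one</li>'
    '<chg:add xmlns:chg="urn:example:change"><li>two</li></chg:add>'
    '</ul>'
)
WITH_PIS = '<ul><li>one</li><?chg-add-start?><li>two</li><?chg-add-end?></ul>'


def child_element_tags(xml_text):
    """Return the tags of the root's child elements, skipping PI nodes."""
    # insert_pis=True keeps PIs in the tree so they are not silently lost.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_pis=True))
    root = ET.fromstring(xml_text, parser=parser)
    # PI nodes carry a non-string tag, so a string check selects real elements.
    return [child.tag for child in root if isinstance(child.tag, str)]


def looks_like_li_plus(tags):
    """Crude stand-in for checking a content model of (li)+."""
    return bool(tags) and all(tag == "li" for tag in tags)


if __name__ == "__main__":
    for label, text in (("wrapper element", WRAPPED), ("PI markers", WITH_PIS)):
        tags = child_element_tags(text)
        print(label, tags, looks_like_li_plus(tags))
```

With the wrapper, the list's children are no longer all li elements, so the content model would have to admit the wrapper everywhere a change might occur. With PI markers, the element structure seen by a validator is the same as in the unmarked document.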
I'd suggest that the next step be a brief "survey" of existing tools that do some kind of change markup (I think just asking folks in the SC what they know is good enough) to find out what general solutions are out there (e.g., elements, PIs, or something else, and how many different kinds of changes they capture). Then submit that information to the DITA TC, along with something like what I've written here, to get a sense of what the DITA TC at large is thinking with respect to change tracking. Hopefully that would determine whether the next step should be development of element markup or PI markup, and how far we should go in trying to capture the various kinds of changes.

paul

p.s. I will no longer be participating on the OASIS DITA committees, so I will not be able to contribute to this discussion further.