dita-techcomm message

Subject: change markup thoughts

At our last DITA TechComm SC telcon, I took the action
to write up some background/thoughts about change
markup (e.g., namespaced elements versus PIs) for
the SC to use as input into an email to send to the
DITA TC about the subject.  This is my attempt at
outlining some of the issues.  Apologies if it comes
out as somewhat of a "core dump".

There can be a variety of changes one might want to mark, 
and a variety of granularities at which one might want 
to mark them.  

A minimum would be to define a fairly large granularity 
(e.g., a DITA topic, a DocBook chapter, an ATA pageblock)
and any time any part of it changes, you mark the pre-changed
copy as deleted and the post-changed copy as added.  While
this sounds like (and is) quite the sledgehammer approach, it
actually has been used in various applications, especially
when the CMS being used only works on large chunks.  But the
current plan, of course, is to go further.
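
As a rough illustration of the large-granularity approach, here is a
minimal Python sketch.  It assumes each version of the document set is
available as a mapping from a topic id to that topic's serialized XML;
the function name, the ids, and the plain string comparison are all my
own assumptions, not anything a particular CMS does.

```python
def coarse_diff(old_topics, new_topics):
    """Mark whole topics as deleted/added whenever anything in them differs.

    Both arguments map a (hypothetical) topic id to the topic's XML text.
    A changed topic yields two marks: the old copy deleted, the new copy added.
    """
    marks = []
    for topic_id in sorted(set(old_topics) | set(new_topics)):
        old = old_topics.get(topic_id)
        new = new_topics.get(topic_id)
        if old == new:
            continue  # unchanged topic, nothing to mark
        if old is not None:
            marks.append(("deleted", topic_id))
        if new is not None:
            marks.append(("added", topic_id))
    return marks


old_set = {
    "t1": "<topic id='t1'><title>Install</title></topic>",
    "t2": "<topic id='t2'><title>Remove</title></topic>",
}
new_set = {
    "t1": "<topic id='t1'><title>Installing</title></topic>",
    "t2": "<topic id='t2'><title>Remove</title></topic>",
    "t3": "<topic id='t3'><title>Upgrade</title></topic>",
}
marks = coarse_diff(old_set, new_set)
```

Note that a one-word change in t1 produces both a "deleted" and an
"added" mark for the entire topic, which is exactly the sledgehammer
effect described above.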

Considering finer granularities, an element can be "changed"
in various ways including:

1.  being deleted

2.  being added

3.  having an attribute value changed or an attribute
    specification added or deleted

4.  having a child (text, element, comment, PI) added,
    deleted, or moved (but still a child of the element
    in question, or else it would be a deletion from the
    point of view of the current element)
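
To make kinds 3 and 4 concrete, here is a small Python sketch that
compares two versions of one element.  It is only a sketch under some
assumptions of mine: child elements are matched by an "id" attribute
(hypothetical, real content may not have one), and text, comment, and
PI children are ignored.

```python
import xml.etree.ElementTree as ET


def attribute_changes(old, new):
    """Return {name: (old_value, new_value)} for attributes that differ.

    None stands for "attribute not specified" in that version.
    """
    names = set(old.attrib) | set(new.attrib)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(names)
        if old.get(name) != new.get(name)
    }


def child_changes(old, new):
    """Compare child elements, matched by an assumed 'id' attribute."""
    old_ids = [child.get("id") for child in old]
    new_ids = [child.get("id") for child in new]
    added = [i for i in new_ids if i not in old_ids]
    deleted = [i for i in old_ids if i not in new_ids]
    # Children kept in both versions but in a different relative order.
    kept_old = [i for i in old_ids if i in new_ids]
    kept_new = [i for i in new_ids if i in old_ids]
    reordered = kept_old != kept_new
    return added, deleted, reordered


old = ET.fromstring(
    '<ol compact="yes"><li id="a"/><li id="b"/><li id="c"/></ol>'
)
new = ET.fromstring(
    '<ol outputclass="steps"><li id="b"/><li id="a"/><li id="d"/></ol>'
)

attr_diff = attribute_changes(old, new)
added, deleted, reordered = child_changes(old, new)
```

Here attr_diff records one attribute specification deleted and one
added, while the child comparison reports an added child, a deleted
child, and a move among the children that remain.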

Other more subtle kinds of changes which we may or may
not wish to try to capture include taking an element
and splitting it into two, taking two consecutive elements
and joining them into one, taking an element and--without
touching its contents--changing the element name (e.g.,
changing an ol to a ul), and taking some content and adding
surrounding markup (e.g., highlighting a word and wrapping
it with a phrase element).  All of these changes (in fact,
just about any change) could be handled by marking a
deletion of whatever was old and an addition of whatever
is new, if we don't feel the need to capture the subtler
distinctions.

An element can be marked as added or deleted using an
attribute (e.g., rev or something else) on the element
in question.  But it gets harder to indicate changes in
attributes using an attribute.  And indicating content
changes, of course, cannot be done solely with attributes.

To indicate changes to the child of an element, such as
some text, one can use a special element (e.g., add or
delete) that indicates that its contents were added or
deleted.  The problem with using a (non-empty) element to
mark additions or deletions is that you have to define its
content model to be ANY so that it can wrap almost anything,
and then you lose any context enforcement within that
element.  The exception is a tool with special processing
that treats the change markup elements as "invisible" with
respect to context checking, but this is not standard
parsing with most schema languages (e.g., a DTD).
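
One way to picture the "invisible" treatment is as a preprocessing
step that removes the change markup before the document is checked
against its normal content models.  The Python sketch below is only an
illustration of that idea, not how any particular tool does it; the
element names add and delete are the examples used above, and the
function name is my own.

```python
from xml.dom import minidom


def strip_change_markup(doc, add_tag="add", delete_tag="delete"):
    """Recover the current document: drop deletions, unwrap additions."""
    # Deleted content disappears entirely.
    for node in list(doc.getElementsByTagName(delete_tag)):
        node.parentNode.removeChild(node)
    # Added content stays, but its wrapper element is removed.
    for node in list(doc.getElementsByTagName(add_tag)):
        parent = node.parentNode
        while node.firstChild is not None:
            # insertBefore detaches the child from the wrapper first.
            parent.insertBefore(node.firstChild, node)
        parent.removeChild(node)
    return doc


marked = minidom.parseString(
    "<p>The <delete>old</delete><add><b>new</b></add> text.</p>"
)
current = strip_change_markup(marked).documentElement.toxml()
```

After this step the b element is again a direct child of p, so an
ordinary validating parser can enforce the usual context rules on it.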

One could avoid this issue by using only empty elements for
change markup, defining the semantics of certain change
markup elements to be "start change" and "end change".
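
For example, a processor could recover the changed content by walking
the document in order and collecting whatever falls between a start
marker and the matching end marker.  In this Python sketch the element
names change-start and change-end are hypothetical, and nested or
overlapping ranges are not handled.

```python
from xml.dom import minidom


def text_in_change_ranges(doc, start_tag="change-start", end_tag="change-end"):
    """Collect the text that falls between empty start/end marker elements."""
    ranges = []
    current = None  # list of text pieces while inside a range, else None

    def walk(node):
        nonlocal current
        if node.nodeType == node.ELEMENT_NODE:
            if node.tagName == start_tag:
                current = []  # open a new range
                return
            if node.tagName == end_tag:
                if current is not None:
                    ranges.append("".join(current))
                current = None  # close the range
                return
        elif node.nodeType == node.TEXT_NODE and current is not None:
            current.append(node.data)
        for child in node.childNodes:
            walk(child)

    walk(doc)
    return ranges


doc = minidom.parseString(
    "<p>Press <change-start/><b>Enter</b> twice<change-end/> to continue.</p>"
)
changed = text_in_change_ranges(doc)
```

The b element keeps its normal parent here, so its context can still
be checked; only the two empty markers need to be allowed in the
content models.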

On the other hand, once we've gotten to this point in our 
thinking, it might make sense to consider using processing 
instructions (PIs) for our change markup as several tools
have already tended to do.
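
A PI-based version of the same start/end idea might look like the
following Python sketch.  The PI target "change" and its data values
are purely hypothetical and are not the syntax of Arbortext or any
other tool.

```python
from xml.dom import minidom


def find_change_pis(doc, target="change"):
    """Return the data of every PI with the given target, in document order."""
    found = []

    def walk(node):
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == target):
            found.append(node.data)
        for child in node.childNodes:
            walk(child)

    walk(doc)
    return found


doc = minidom.parseString(
    "<p>Press <?change start?><b>Enter</b><?change end?> to continue.</p>"
)
pis = find_change_pis(doc)
```

Because PIs are not part of any content model, a document marked up
this way parses against the unmodified DTD.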

There is a technical description of Arbortext's change markup at
This page is written in terms of namespaced elements, but in fact,
Arbortext only uses the namespaced elements to mark up XML.  When
change tracking is done on an SGML document, PIs corresponding
to the described namespaced elements are used instead.  

I'd suggest that the next step be a brief "survey" of existing
tools that do some kind of change markup (I think just asking
folks in the SC what they know is good enough) to find out 
what general solutions are out there (e.g., elements, PIs,
or something else, and how many different kinds of changes
they capture).  Then submit to the DITA TC that information
along with something like what I've written here to get a sense
of what the DITA TC at large is thinking with respect to change
tracking.  Hopefully that would determine if the next step should
be development of element markup or PI markup and how far we
should go in trying to capture the various kinds of changes.


p.s.  I will no longer be participating on the OASIS DITA
committees, so I will not be able to contribute to this
discussion further.
