
office-collab message



Subject: Some thoughts on Change Tracking


Hi,
  Firstly, apologies for missing the last conference call. 

  Since we are gathering wiki pages for issues with GCT and ECT I
thought I would reiterate some of my thoughts FWIW. I think most of the
points in this email have been discussed before. I am bringing them
together to help make it easier for them to be included and referenced
in any document such as consensus reports etc. Note that I'm just an
implementer, so any thoughts I express below are more or less how I
imagine change tracking might be desired by certain consumers.

  My first line of thought boils down to what the motivation is for
performing change tracking in the first place.

  At a very abstract level, one could store content.xml and the other
package files in a git repository and run rather complex XML or semantic
diffs over the revision history to obtain change tracking without any
application support. Some of the issues raised on the list recently come
up here for semantic diffs, for example treating styles with equivalent
semantics but different names as equal and thus "not a change". See:
http://markmail.org/message/65dsnvrzmbdmjxka
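
  As a rough illustration of that style issue, here is a minimal sketch
of a comparison that ignores style names and looks only at the properties
behind them. The toy document layout (a dict with "styles" and
"paragraphs") and the style names are assumptions made up for this
example; a real semantic diff over content.xml and styles.xml would have
to deal with inheritance, defaults and much more.

```python
def canonical_styles(styles):
    """Map each style name to a key built from its properties only."""
    return {name: frozenset(props.items()) for name, props in styles.items()}

def semantically_equal(old_doc, new_doc):
    """Compare two toy documents, ignoring the names their styles carry."""
    def resolve(doc):
        keys = canonical_styles(doc["styles"])
        # Replace each style name by the property set it stands for.
        return [(text, keys[style]) for text, style in doc["paragraphs"]]
    return resolve(old_doc) == resolve(new_doc)

# Hypothetical data: the same bold paragraph, saved under two style names.
old = {"styles": {"P1": {"font-weight": "bold"}},
       "paragraphs": [("Hello", "P1")]}
renamed = {"styles": {"P7": {"font-weight": "bold"}},
           "paragraphs": [("Hello", "P7")]}
restyled = {"styles": {"P7": {"font-weight": "normal"}},
            "paragraphs": [("Hello", "P7")]}

renamed_is_change = not semantically_equal(old, renamed)
restyled_is_change = not semantically_equal(old, restyled)
```

  With this comparison the rename from P1 to P7 is "not a change", while
the switch from bold to normal is.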

  However, one might like the document editor itself, be that
LibreOffice, Calligra, abiword, or AJAX code, to help you see changes
and offer to negate them, track new changes, or perform other actions.
It then becomes interesting for ODF itself to allow a representation of
changes to be captured in the document file itself. This is as opposed
to storing just the latest state in ODF and performing a diff against an
older complete ODF document.

  This is where I find the ECT bucket concept extremely troublesome. In
order for an application to tell you what changed it must perform a
complex semantic diff between inline content in content.xml (r10) and
the latest bucket (r9). To find what changed in (r9) then another
complex semantic diff is needed between r9 and r8 and so forth. 
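
  To make the cost concrete, here is a small sketch that contrasts
inferring changes from a chain of full snapshots with reading changes
that were recorded explicitly. Everything in it is an assumption for
illustration: snapshots are plain lists of lines, and a line diff from
Python's difflib stands in for the far harder semantic XML diff.

```python
import difflib

def snapshot_diff(older, newer):
    """Stand-in for a semantic XML diff: a plain line diff of two snapshots."""
    return [line for line in difflib.ndiff(older, newer)
            if line.startswith(("+ ", "- "))]

def changes_via_buckets(snapshots, lookback):
    """Infer the last `lookback` change sets from full snapshots.

    snapshots[-1] plays the role of the inline content (r10);
    snapshots[-2], snapshots[-3], ... play the role of buckets (r9, r8, ...).
    One diff has to be run for every revision looked back.
    """
    inferred = []
    for i in range(len(snapshots) - 1, len(snapshots) - 1 - lookback, -1):
        inferred.append(snapshot_diff(snapshots[i - 1], snapshots[i]))
    return inferred

def changes_via_records(records, lookback):
    """Read the last `lookback` change sets that were stored explicitly."""
    return list(reversed(records[-lookback:]))

r8 = ["Intro"]
r9 = ["Intro", "Scope"]
r10 = ["Intro", "Scope", "Summary"]
inferred = changes_via_buckets([r8, r9, r10], lookback=2)
explicit = changes_via_records([["+ Scope"], ["+ Summary"]], lookback=2)
```

  Both calls yield the same two change sets here, but the first has to
compute them with one diff per revision, and the answer depends on the
quality of the diff, while the second only reads what was written down.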

  I find this an extremely critical issue, as it means that users of
change tracking are relying on various applications to infer changes
rather than being told directly and explicitly what has changed. 

  For government use cases this might be an unacceptable distinction and
as such the use of ODF for change tracking of documents might be
rejected due to uncertainty. This is without considering the
computational complexity of performing these diffs over chains of
buckets to look back 20 revisions.

  Another way ECT buckets are, IMHO, deceptive is that they are proposed
as a means to make things simple. Contrary to this, in abiword loading
and saving an ECT bucket would require a different code path for buckets
than for mainstream content. It seems the OOo code would also require
such attention:
http://markmail.org/message/ox5ft4g57tgtyepz
  At least for abiword, changes to an attribute are stored inline in the
data model. Thus it actually becomes harder to implement writing to an
ECT bucket than to just use GCT ac:change attributes during a save. This
is because to write a bucket the code needs to consider a whole range of
the in-memory model, and the revision and state attached to everything
in that range, to figure out what the bucket content will be.

  I reiterate my concern for matched pair XML elements in ECT buckets
(UC7 and UC8). Having the start XML element move to a bucket will cause
problems with matching pair end XML elements. One could at times insert
a new matching end tag, thus splitting two matching elements into four. 

  This also cascades into future changes, and the operation may very well
need to change the names associated either with the new pair or the old
pair of elements to maintain uniqueness. Such a name change itself needs
to be change tracked or at least be able to explicitly and
deterministically imply the link between the two pairs (the renamed
one(s) and the original one) so that GUI elements can offer both to
users wishing to review revisions.
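
  The splitting and renaming described above can be sketched on a toy
token stream. This is only an illustration under assumed conventions:
text is a plain string, a paired element is a ("start" | "end", name)
tuple, the "-cont" suffix is an invented naming scheme, and the returned
rename map stands for whatever explicit link the format would record.

```python
def split_pair(tokens, cut, suffix="-cont"):
    """Move tokens[:cut] into a bucket, repairing any pair the cut separates.

    Returns (bucket, remaining, renames); renames maps new name -> original.
    """
    bucket, remaining = list(tokens[:cut]), list(tokens[cut:])
    # Find pairs that were opened in the bucket but not closed there.
    open_names = []
    for tok in bucket:
        if isinstance(tok, tuple):
            kind, name = tok
            if kind == "start":
                open_names.append(name)
            elif name in open_names:
                open_names.remove(name)
    renames = {}
    for name in reversed(open_names):
        bucket.append(("end", name))  # close the pair inside the bucket
        new_name = name + suffix      # the leftover half needs a unique name
        renames[new_name] = name
        remaining = [("end", new_name) if tok == ("end", name) else tok
                     for tok in remaining]
        remaining.insert(0, ("start", new_name))
    return bucket, remaining, renames

tokens = ["A", ("start", "bm1"), "B", "C", ("end", "bm1"), "D"]
bucket, remaining, renames = split_pair(tokens, cut=3)
```

  One pair (two elements) becomes two pairs (four elements), and the
rename map is the piece a GUI would need in order to present "bm1" and
"bm1-cont" to the user as the same original element.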

  As I expressed early on, I think it might also be useful for the GCT
to have some conformance levels which list the elements, attributes,
etc. that must be tracked in order for an application to claim a given
level of conformance. This has been discussed over a few threads on the
list. 
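
  A conformance level of this kind could be checked mechanically. The
sketch below is purely hypothetical: the level numbers and the element
lists are invented for illustration and are not taken from any proposal
on the list.

```python
# Hypothetical levels: each lists the elements that must be change tracked.
LEVELS = {
    1: {"text:p", "text:h"},
    2: {"text:p", "text:h", "text:span", "table:table-row"},
}

def highest_level(tracked):
    """Return the highest level whose required set is fully tracked, or 0."""
    tracked = set(tracked)
    best = 0
    for level in sorted(LEVELS):
        if LEVELS[level] <= tracked:
            best = level
        else:
            break  # levels are cumulative, so stop at the first gap
    return best

claimed = highest_level({"text:p", "text:h", "text:span"})
```

  An application tracking only paragraphs, headings and spans could then
claim level 1 but not level 2 in this made-up scheme.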

  For longer life documents, some form of change tracking epochs might
be useful:
http://markmail.org/message/64xcxoagwyxy4sn4

  And in any case, I think change tracking on the RDF of the document
should definitely not be a forgotten item:
http://markmail.org/message/4t6zlmiieno2g7on



