office-comment message


Subject: c/o office-collab: Regarding ODF Collaboration.


Dear Advanced Document Collaboration SC,

I spent some time adding ODF support to my NativeOOXML rendering
engine, and along the way I stumbled upon the ODF Collaboration SC.
Since one of the design goals of my implementation is real-time
collaboration a la Jupiter (or its web-based version, Wave) or the
newer versions of Google Docs, I had a closer look.
The magic behind state-of-the-art collaboration is a technique
called “operational transformation”. I really recommend reading
(http://www.waveprotocol.org/whitepapers/operational-transform).
Bottom line: you need a document model that is changed only by
clearly defined operations, and the operations need to be designed
so that you can apply them in a different order by “transforming”
them. A very simple example is the following pair of operations:
Insert(pos=0, “Hello “); Insert(pos=6, “World”), which leads to the
document “Hello World”. You can change the order of the operations by
transforming the insert positions. E.g. the sequence of operations
Insert(pos=0, “World”); Insert(pos=0, “Hello “) leads to the same
document “Hello World”. You can show that a quite robust (online)
collaboration system can be built on operational transformation.
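The reordering in the example above can be sketched in a few lines of
Python. This is only a minimal illustration, assuming a document that is
a plain string and a single operation type; the names Insert, apply and
swap are made up for this sketch and are not taken from the Wave
whitepaper or from any ODF specification. A real OT system also has to
handle deletions, concurrent operations and inserts that land inside
each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # character offset in the document
    text: str  # text to insert at that offset


def apply(doc: str, op: Insert) -> str:
    """Return the document with `op` applied."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def swap(first: Insert, second: Insert):
    """Given `first` followed by `second`, return (second2, first2) such
    that applying second2 and then first2 yields the same document."""
    if second.pos <= first.pos:
        # `second` landed at or before the start of `first`'s text,
        # so `first` moves right by the length of `second`.
        return second, Insert(first.pos + len(second.text), first.text)
    if second.pos >= first.pos + len(first.text):
        # `second` landed after `first`'s text, so without `first`
        # in place it moves left by the length of `first`.
        return Insert(second.pos - len(first.text), second.text), first
    # `second` landed inside the text inserted by `first`; swapping
    # would require splitting `first`, which this sketch does not do.
    raise ValueError("overlapping inserts cannot be swapped in this sketch")


# The example from the text.
a, b = Insert(0, "Hello "), Insert(6, "World")
b2, a2 = swap(a, b)
assert apply(apply("", a), b) == apply(apply("", b2), a2) == "Hello World"
```

Here swap turns Insert(0, "Hello "); Insert(6, "World") into
Insert(0, "World"); Insert(0, "Hello "), which is the transformed
sequence given above.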

Anyway.
What I found was odd.

If I understood correctly, the ODF Collaboration SC is heading in the
direction of applying an XML-diff algorithm to the
ODF/XML serialization to improve ODF collaboration.

Why do I find this odd?

Well, first of all: XML-diff???? Really? Isn't that the wrong layer? I
cannot see why the creator of an --- let's say --- ODF text document
would be interested in the fact that some ODF/XML tags changed. I'd
rather think that a user would be interested in the actual
user-imposed changes to the document --- or, more precisely, the
operations applied to the document by other users.

Second: I find it rather ironic to use an XML-diff algorithm for ODF
collaboration, especially since the XML-diff algorithm was invented
because the “plain text” longest-common-subsequence diff algorithms
were too generic for XML. (Remember: XML is also text, or at least has
a text representation.) So if we need an XML-diff because XML is more
special than plain text, why doesn't the same rule apply to ODF? Don't
we need a “special” algorithm for ODF? <irony>Since ODF is XML and
therefore ODF is also text, why don't we apply good old “diff” to
it?</irony>
All I'm trying to point out is: beware --- if all you have is a hammer,
everything looks like a nail.

Third: How do we do cool online collaboration with an XML-Diff based
change tracking?

Fourth: Where is ODF more special than XML (with respect to change
tracking)? The normal use case for XML docs is that you get two XML
docs (let's say from a web service or a database) and you want to see
the difference between the trees they represent. (Please note that you
are not interested in the changes to the textual representation but
rather in the change to the tree --- very much like you are not
interested in the change to the ODF/XML but in the change to the text
document.) In ODF you usually have an editor which applies operations
to the document. These operations directly represent not only the
actual result of the user's changes but also the history of the
changes. So one very remarkable difference is that ODF documents are
changed by editors which are able to track the operations applied by
users. This allows very fine-grained and powerful collaboration.
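To illustrate the point about history, here is a small Python sketch,
assuming a toy editor whose document is a plain string and whose log
holds only two hypothetical operation types, "insert" and "delete".
Two different editing histories end in the same document, so anything
that only compares the state before and after cannot tell them apart,
while the operation log can.

```python
def insert(doc, pos, text):
    """Insert `text` at character offset `pos`."""
    return doc[:pos] + text + doc[pos:]


def delete(doc, pos, length):
    """Remove `length` characters starting at offset `pos`."""
    return doc[:pos] + doc[pos + length:]


# Hypothetical operation names for this sketch only.
OPS = {"insert": insert, "delete": delete}


def replay(doc, log):
    """Apply a recorded list of (name, *args) operations in order."""
    for name, *args in log:
        doc = OPS[name](doc, *args)
    return doc


# One user types the whole text at once ...
log_a = [("insert", 0, "Hello World")]
# ... another makes a typo and fixes it afterwards.
log_b = [("insert", 0, "Hello Word"), ("insert", 9, "l")]

# Same final document, different history.
assert replay("", log_a) == replay("", log_b) == "Hello World"
assert log_a != log_b
```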

An alternative approach to bring cool OT-ready collaboration to ODF:

Very simple:
(a) Clarify the existing change tracking. Make sure people understand
that <p>Hello <changed-start/>World<changed-end/></p> represents an
operation Insert(“World” at the position 6 of the paragraph). With
that information applications can implement decent OT-based
collaboration.
(b) Simply add markup for missing operations like: Insert-Row,
Delete-Row, Insert-Cell, Delete-Cell, Move-Text, etc. The only
challenge here is to find a comprehensive list of operations.
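Point (a) can be sketched in Python as follows. This is an illustration
under stated assumptions: the element names changed-start and
changed-end are the simplified markers from the example above, not the
actual ODF change-tracking elements, and the function
inserts_from_paragraph is a made-up helper that handles only a flat
paragraph without nested elements. Positions are offsets into the
paragraph text as it stands after the change.

```python
import xml.etree.ElementTree as ET


def inserts_from_paragraph(xml):
    """Return a list of (position, text) insert operations found in a
    flat paragraph that uses <changed-start/> and <changed-end/>."""
    p = ET.fromstring(xml)
    ops = []
    pos = len(p.text or "")  # characters seen before the first marker
    start = None             # offset where the current change began
    buf = []                 # text collected inside the current change
    for child in p:
        if child.tag == "changed-start":
            start, buf = pos, []
        elif child.tag == "changed-end" and start is not None:
            ops.append((start, "".join(buf)))
            start = None
        # In ElementTree the text following an empty element is its tail.
        tail = child.tail or ""
        if start is not None:
            buf.append(tail)
        pos += len(tail)
    return ops


ops = inserts_from_paragraph(
    "<p>Hello <changed-start/>World<changed-end/></p>"
)
assert ops == [(6, "World")]
```

The result corresponds to Insert(“World” at position 6 of the
paragraph), which is the information an application needs to feed an
OT-based collaboration layer.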

In case you “remain surprised that neither Apple nor Google are taking
ODF support seriously”
(http://webmink.com/2011/01/18/apple-and-google-and-odf/) maybe ---
just maybe ---  some support of a state-of-the-art technology can
change this.

Best regards,

Florian


P.S.
I needed to elaborate on operational transformation (OT) a bit to
make my point clear. However, OT is not needed at the ODF layer. It's
the responsibility of the application, and applications don't need to
implement it if they don't want real-time online collaboration. What
is needed in ODF is the recording of the operations applied by the
user, not the recording of the ODF/XML changes.

P.P.S.
I think the Delta-XML-diff is a very cool algorithm. I just don't
think it's the right layer of change tracking for ODF documents.

