OASIS Mailing List Archives



office-comment message


Subject: Re: [office-comment] c/o office-collab: Regarding ODF Collaboration.

On Monday 24 January 2011 22:36:05 Florian Reuter wrote:
> Dear Advanced Document Collaboration SC,
> I spent some time adding ODF support to my NativeOOXML rendering
> engine, and I finally stumbled upon the ODF Collaboration SC. Since one
> of the design goals of my implementation is real-time collaboration a la
> Jupiter (or its web-based version Wave) or the newer versions of
> Google Docs, I had a closer look.
> The magic behind state-of-the-art collaboration is a technique
> called “operational transformation”. I really recommend reading
> (http://www.waveprotocol.org/whitepapers/operational-transform).
> Bottom line: You have to have a document model which is only changed
> by clearly defined operations, and the operations need to be designed
> in a way that you can apply them in a different order by “transforming
> them”. A very simple example is the following pair of operations:
> Insert(pos=0, “Hello ”); Insert(pos=6, “World”), which lead to the
> document “Hello World”. You can change the order of the operations by
> transforming the insert positions. E.g. the sequence of operations
> Insert(pos=0, “World”); Insert(pos=0, “Hello ”) would lead to the same
> document “Hello World”. You can show that you can build a quite robust
> (online) collaboration system based on operational transformation.
> Anyway.
> What I found was odd.
> If I understood correctly, the ODF Collaboration SC is heading in the
> direction of applying an XML-diff algorithm to the
> ODF/XML serialization to improve ODF collaboration.
> Why do I find this odd?
> Well, first of all: XML-diff???? Really? Isn't that the wrong layer? I
> cannot see why a creator of an --- let's say --- ODF text document
> would be interested in the fact that some ODF/XML tags changed. I'd
> rather think that a user would be interested in the actual
> user-imposed changes to the document --- or more precisely --- the
> operations applied to the document by other users.
> Second: I find it rather ironic to use an XML-Diff algorithm for ODF
> collaboration, especially since the XML-Diff algorithm was invented
> because the “plain text” longest-common-subsequence diff algorithms
> were too generic for XML. (Remember: XML is also text, or rather has a
> text representation.) So if we need an XML-Diff because XML is more
> special than plain text, why doesn't the same rule apply to ODF? Don't
> we need a “special” algorithm for ODF? <irony>Since ODF is XML and
> therefore ODF is also text, why don't we apply good old “diff” to
> it?</irony>
> All I'm trying to point out is: Beware --- if all you have is a hammer,
> everything looks like a nail.
> Third: How do we do cool online collaboration with an XML-Diff based
> change tracking?
> Fourth: Where is ODF more special than XML (with respect to change
> tracking)? The normal use case for XML docs is that you get two XML
> docs (let's say from a web service or a database) and you want to see
> the difference in the trees they represent. (Please note that you are
> not interested in the changes of the textual representation but rather
> in the change of the tree --- very much like you are not interested in
> the change of the ODF/XML but in the change of the text document.) In
> ODF you usually have an editor which applies operations to the
> document. These operations directly represent not only the actual
> results of the user's changes but also the history of the changes. So
> one very remarkable difference is that ODF documents are changed by
> editors which are able to track the operations applied by users. This
> allows very fine-grained and powerful collaboration.
> An alternative approach to get cool OT-ready collaboration to ODF:
> Very simple:
> (a) Clarify the existing change tracking. Make sure people understand
> that <p>Hello <changed-start/>World<changed-end/></p> represents an
> operation Insert(“World” at the position 6 of the paragraph). With
> that information applications can implement decent OT-based
> collaboration.
> (b) Simply add markup for missing operations like: Insert-Row,
> Delete-Row, Insert-Cell, Delete-Cell, Move-Text, etc. The only
> challenge here is to find a comprehensive list of operations.
> In case you “remain surprised that neither Apple nor Google are taking
> ODF support seriously”
> (http://webmink.com/2011/01/18/apple-and-google-and-odf/) maybe ---
> just maybe ---  some support of a state-of-the-art technology can
> change this.
> Best regards,
> Florian
> P.S.
> I needed to elaborate on operational transformation (OT) a bit to
> make my point clear. However, OT is not needed at the ODF layer. It's
> the responsibility of the application, and applications don't need to
> implement it if they don't want real-time online collaboration. What
> is needed in ODF is the recording of the operations applied by the
> user and not the recording of the ODF/XML changes.
> P.P.S.
> I think the Delta-XML-diff is a very cool algorithm. I just don't
> think it's the right layer of change tracking for ODF documents.
I agree with most of your points, at least as far as real-time collaboration
goes. Using XML for that seems rather odd to me too.
My suggestion would be to use a simple text-based OT algorithm and, as an
add-on layer, handle the complete transfer of styles (in the form of ODF XML).
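
To make the "simple text based OT" idea concrete, here is a minimal sketch
in Python that covers only concurrent insert operations, using the
"Hello World" example from the quoted mail. The names Insert, apply_op and
transform, and the "site" tie-breaker field, are assumptions made for
illustration; they are not taken from Wave, Jupiter or any ODF
implementation, and a real system would also need delete operations and a
protocol for ordering them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int       # character offset in the plain text
    text: str      # text to insert at that offset
    site: int = 0  # hypothetical site id, used only to break position ties


def apply_op(doc: str, op: Insert) -> str:
    """Apply a single insert operation to a plain-text document."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `other`.

    Both operations are assumed to have been created against the same
    document state. If `other` inserted at an earlier position (or at the
    same position with a lower site id), `op` is shifted right by the
    length of the text that `other` inserted.
    """
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


# Two users edit the same empty document at the same time.
a = Insert(0, "Hello ", site=1)
b = Insert(0, "World", site=2)

# Site 1 applies its own operation first, then the transformed remote one.
doc_at_site_1 = apply_op(apply_op("", a), transform(b, a))
# Site 2 does the same in the opposite order.
doc_at_site_2 = apply_op(apply_op("", b), transform(a, b))

print(doc_at_site_1)
print(doc_at_site_2)
```

Both sites should end up with the same text even though they applied the
operations in a different order, which is the property the quoted mail
describes. Styles would sit on top of this as a separate layer.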

However, for change tracking, which is supposed to be file-based anyway, I
think DeltaXML still makes a certain amount of sense.
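
For comparison, point (a) of the quoted mail reads file-based change
markup as an operation. A rough Python sketch of that reading follows. The
tag names changed-start and changed-end are copied from the quoted example
and are not meant to be the actual ODF element names, the function name
insert_from_markup is made up, and the sketch assumes a single tracked
insertion with no nested markup inside it.

```python
import xml.etree.ElementTree as ET


def insert_from_markup(paragraph_xml: str):
    """Return (position, text) for the first tracked insertion, or None.

    Assumes the simplified markup from the quoted mail: an empty
    <changed-start/> element, the inserted text, then <changed-end/>.
    """
    p = ET.fromstring(paragraph_xml)
    pos = len(p.text or "")  # text before the first child element
    for child in p:
        if child.tag == "changed-start":
            # The inserted text is the tail following the start marker.
            return pos, child.tail or ""
        # Any other child counts as ordinary content before the insertion.
        pos += len("".join(child.itertext())) + len(child.tail or "")
    return None


op = insert_from_markup("<p>Hello <changed-start/>World<changed-end/></p>")
print(op)  # (6, 'World'), i.e. Insert("World" at position 6)
```

An application could feed the resulting position and text into whatever
operation model it uses, while the file itself stays a plain tracked-change
document.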

Anyway, I'm just writing this because you echoed thoughts I've been having
about this.
