Subject: Re: Counting in Access Paths
From: Svante Schubert <svante.schubert@gmail.com>
To: "office-collab@lists.oasis-open.org" <office-collab@lists.oasis-open.org>
Date: Wed, 17 Oct 2012 18:21:14 +0200
On 16.10.2012 22:10, Dennis E. Hamilton wrote:
> This is a separate topic, although it certainly figures in how various interop challenges will be handled.
>
> It occurs to me that counting has some disconcerting consequences.
>
> I mean this kind of thing:
>
>      <del s="/2/10" e="/3/18" />
>      <merge s="/2" e="/3" />
>
> There are some interesting consequences:
>
>  1. It is nearly impossible to do these manually (as when fabricating tests, examples, etc.).  That goes to creating them and also checking them manually in forensic work.
Basically, the path language is a simplification of a W3C XPath path,
and I have never heard anyone complain that using XPath amounts to
forensic work. It was quite a success, if I remember correctly.
I even made an instant empirical sanity check and asked some people at
random here at the ODF Plugfest in Berlin: they liked it very much, and
there was no negative comment or complaint about it at all.
>  2. Is there a presumed canonicalization when it comes down to counting in text content? And how are component elements
>     counted?  I assume they count as 1.
Yes, every component counts as one, and as every text character is a
component, characters are counted accordingly. Original Operational
Transformation counts the gaps between text characters, starting with
zero. In addition, computer science usually starts counting with 0, but
XML counts the components and starts with 1, and human languages start
with 1, not with 0. As the serialization is in XML and meant to be
readable by humans, it starts with 1. Read the operation
<add type="paragraph" s="/3" /> as: add a paragraph at the 3rd position.
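
To illustrate the counting, here is a minimal sketch in Python. It is
not part of MCT: the document model (a plain list of paragraph strings)
and the helper names parse_path, resolve and add_paragraph are
assumptions made only for this example.

```python
def parse_path(path):
    """Split an access path such as "/2/10" into 1-based integer steps."""
    if not path.startswith("/"):
        raise ValueError("an access path must start with '/'")
    steps = [int(step) for step in path.strip("/").split("/")]
    if any(step < 1 for step in steps):
        raise ValueError("counting starts with 1, not 0")
    return steps


def resolve(doc, path):
    """Return the component a path addresses.

    Assumed model: doc is a list of paragraph strings, so the first
    step selects a paragraph and an optional second step a character.
    """
    node = doc
    for step in parse_path(path):
        node = node[step - 1]  # 1-based path step -> 0-based index
    return node


def add_paragraph(doc, path, text=""):
    """Apply <add type="paragraph" s="/n" />: insert at the n-th position."""
    (position,) = parse_path(path)
    doc.insert(position - 1, text)


doc = ["Heading", "First paragraph", "Second paragraph"]
add_paragraph(doc, "/3", "New paragraph")
print(doc[2])                # the new paragraph is now the 3rd component
print(resolve(doc, "/2/1"))  # first character of the 2nd component
```

In this sketch the paragraph that was third before the operation simply
becomes the fourth; nothing else in the document has to be touched.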
>  3. This seems to be very brittle.  That is, anything that is done by some tracking-negligent tool that changes the offset of material that is touched by tracking will completely break the tracked change, with no detection that it happened.  In the ODF scheme, as one example, there is much more resilience in the making of alterations that are unrelated to the change.  Even when there is some sort of collision, there is more information that may help resolve it, or at least determine that some of the tracking cannot be relied upon any longer.  That it may not be possible for a consumer to even detect the disconnect is worrisome.
Quite the opposite: it is far from optimal to use (stable) absolute
positioning, either via IDs or (even more stable) by nesting the
changes directly into the content. With either approach it would be
necessary to read the complete content before identifying the changes.
The time needed for a merge would then depend on the document size, and
it would not be possible to send only the changes of a document to
someone else who is working on the same document as well.
Not to mention that merges are based on OT, which requires relative
referencing. With relative references someone might even be able to
edit, or rather document proposed changes on, a read-only ODF document,
be it a signed ODF document or a document on a web server somewhere.
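
As a rough sketch of why OT wants relative references (Python, not
taken from MCT; the function name transform_path and the rule that an
insert at the same position shifts the existing component are
assumptions for illustration only): when a paragraph is concurrently
added at or before the component a path points to, the path is simply
shifted by one instead of becoming invalid.

```python
def transform_path(path, inserted_at):
    """Adjust a path for a concurrent paragraph insert.

    inserted_at is the 1-based top-level position of the concurrently
    added paragraph. Assumption: an insert at or before the addressed
    component pushes that component one position further.
    """
    steps = [int(step) for step in path.strip("/").split("/")]
    if inserted_at <= steps[0]:
        steps[0] += 1  # only the top-level step is affected
    return "/" + "/".join(str(step) for step in steps)


print(transform_path("/2/10", 1))  # insert before the target paragraph
print(transform_path("/2/10", 3))  # insert after it: path is unchanged
```
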
Change tracking, like ODF signatures & encryption, can only be used by
applications supporting it. There is no need to avoid an advanced
technology just because it can be destroyed by a text editor. A user
who edits ODF via a text editor should know what is going on.
Still, similar to revision systems (e.g. git), we might want to use
hashes (e.g. SHA-1) to verify whether a stored (XML) file was changed.
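
Such a check could look roughly like the following Python sketch. It
uses the standard hashlib module; the helper names sha1_of and
is_unchanged, and the idea of recording the digest next to the
operations, are assumptions for this example and not something MCT
defines.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    """Return the SHA-1 digest of the given bytes as a hex string."""
    return hashlib.sha1(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """True if the content still matches the digest recorded earlier."""
    return sha1_of(data) == recorded_digest


original = b"<text:p>Hello</text:p>"
recorded = sha1_of(original)  # stored when the operations were written

print(is_unchanged(original, recorded))
print(is_unchanged(b"<text:p>Hello!</text:p>", recorded))
```

A consumer that finds a mismatch would at least know that the paths in
the operations can no longer be relied upon.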
>
> This is not an objection to MCT in principle, it is simply an objection to the difficulty and the apparent lack of resilience in the scheme by which the tracking is connected to the text that it applies to.  One can also argue that this is not in the spirit of XML-based models at all.
Well, you might be right that operations do not fully represent the
spirit of the XML-based model, but on the other hand operations do
represent the spirit of distributed work. Sooner or later ODF
applications will need to solve real-time collaboration and merges with
advanced techniques such as Operational Transformation. Unfortunately,
XML alone is not the hammer for this nail. Nevertheless, we are still
serializing the operations into XML. I like XML very much, but we need
to use a technology where it is suited.
And difficult? The LibreOffice developers listening to my MCT
presentation were quite excited. Even Michael Stahl - who implemented
RDF metadata in OpenOffice and is always very skeptical - told me
afterwards that this might work!
As long as the implementers like it, I am happy.

Interesting view you are representing, Dennis.
Svante
>
>  - Dennis
>
> -----Original Message-----
> From: office-collab@lists.oasis-open.org [mailto:office-collab@lists.oasis-open.org] On Behalf Of Svante Schubert
> Sent: Tuesday, October 16, 2012 03:04
> To: dennis.hamilton@acm.org
> Cc: office-collab@lists.oasis-open.org
> Subject: Re: [office-collab] Paragraph merge in ODF (earlier - Re: [office-collab] FW: [office] Groups - MCT Challenge #1 Documents (Zip) uploaded)
>
> [ ... ]
>
> The heading is the first component, you start delete text from the second to the third component, which serialized MCT operations for the change of your challenge might be:
> <del s="/2/10" e="/3/18" />
> <merge s="/2" e="/3" /> 
> The above are NOT the undo operations, but the operations that describe your change. The undo will follow as soon we agree on what is being changed in the XML and we (or I) need to think over how to handle styles in general (I will be on the ODF plugfest tomorrow and LibreOffice conference after, so I might have to pause this thread till next week).
>
> I even would omit the second parameter for the merge as only sibling paragraphs can be merged.  
>
> [ ... ]
>
>
Follow-Ups:
- RE: [office-collab] Re: Counting in Access Paths
  - From: "Dennis E. Hamilton" <dennis.hamilton@acm.org>
- Re: [office-collab] Re: Counting in Access Paths
  - From: Patrick Durusau <patrick@durusau.net>
References:
- Counting in Access Paths
  - From: "Dennis E. Hamilton" <dennis.hamilton@acm.org>