Subject: Re: [office-collab] Re: Counting in Access Paths
On Sunday 21 October 2012 20:24:36 PM Dennis E. Hamilton wrote:
> Let's get back to the subject at hand.
>
> 1. The use of one-way absolute (i.e., position-rigid) hierarchical
> references are brittle and cannot survive processing that alters such
> paths without requiring any processing of the markup (at the markup level)
> to find and adjust the references, delete the references, or leave the
> document in a damaged condition.
>
> 2. The use of unique ID values in a bidirectionally-verifiable linkage
> does not have those problems and is a technique already required of ODF
> processors.
>
> 3. I don't believe (1) is *essential* to MCT. If it is essential, I
> would say that MCT is fatally flawed.
>
> 4. What is the technical barrier to using a mechanism like (2) for MCT?
> Is it so objectionable that there are empty elements that serve as markers
> in the texts that have tracked changes?
>
> 5. The making of untracked changes in content that is subject to a
> tracked change is a problem in all approaches. It seems that (2) is safer
> and there may be further safeguards so that a change is not misattributed.
> That's different than having the document be broken, as would happen with
> (1).
>
> I am unclear what the attachment to absolute positional offsets comes from.
> Is it really essential?

So far I have been partial to absolute positioning. This partiality probably originates in my reading of the Google Wave documentation and in my tendency to avoid verbosity in serialization. I have an open mind to the idea of using xml:id though. It certainly could make CT more robust.

The absolute positions would indeed require updating when inserting nodes higher in the hierarchy, when moving nodes and when removing nodes, if these addresses were used in places other than change tracking. The references to labelled nodes would only need updating when nodes are removed, at which point there should not be any references left anyway.

The references as used in change tracking only apply to one version of the document. To have them point to the same node in a different document, the change operation would have to be updated (transformed) to point to the correct node. The addressing should support very easy updating of addresses in change operation transformations. In Google Wave, the addresses are one-dimensional, i.e. not hierarchical, and are easy to update. The hierarchical addresses in the current MCT seem more laborious at first glance. Care should be taken that the addresses can be transformed without information about the document.

The rule for updating the addresses in any of the operations often seems to involve changing only the last number, provided both change operations point to equally deeply nested addresses. (The hierarchical addressing rules depend on the @type attribute, which I have not fully grasped yet. For example, how would a paragraph in a cell in a table in a frame in a paragraph be addressed?)

When using xml:id, the rules for transforming change operations are less obvious to me. Let's take an example similar to the one in the MCT-Merge-enabled-Change-Tracking-wd05 presentation:

  <do>
    <add type="paragraph" s="/1">Aruba</add>
    <add type="paragraph" s="/2">Curaçao</add>
    <add type="paragraph" s="/2">Bonaire</add>
  </do>

This would result in:

  <p>Aruba</p>
  <p>Bonaire</p>
  <p>Curaçao</p>

The position in which the paragraphs are added is clear. How would these rules look when xml:id is used instead of positions? I could imagine something like this:

  <do>
    <add type="paragraph" identifier="a">Aruba</add>
    <add type="paragraph" identifier="c" after="#a">Curaçao</add>
    <add type="paragraph" identifier="b" after="#a">Bonaire</add>
  </do>

An advantage of addressing by position is that it allows change operations to be transformed without any document information. Addressing with xml:id values does not have that advantage.
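
To make the positional example concrete, here is a minimal Python sketch. It is not taken from the MCT draft: it assumes a flat list of paragraphs, 1-based positions, and an "add" that inserts before whatever currently occupies that position. The names apply_adds and swap_adjacent_adds are made up for illustration. The second function shows how two consecutive same-level adds can be reordered by adjusting only their numbers, without looking at the document.

```python
def apply_adds(paragraphs, operations):
    """Apply (position, text) add operations in order; positions are 1-based."""
    result = list(paragraphs)
    for position, text in operations:
        if not 1 <= position <= len(result) + 1:
            raise IndexError(f"position {position} is out of range")
        result.insert(position - 1, text)
    return result


def swap_adjacent_adds(first, second):
    """Reorder two consecutive same-level adds so the result is unchanged.

    Only the positions of the two operations are used (assumption: both
    operations address siblings at the same nesting level).
    """
    (pos_a, text_a), (pos_b, text_b) = first, second
    if pos_b <= pos_a:
        # The second add landed at or before the first one, so once it is
        # applied first, the other paragraph sits one place further down.
        return (pos_b, text_b), (pos_a + 1, text_a)
    # The second add landed after the first one; without the first add in
    # place, its own position is one lower.
    return (pos_b - 1, text_b), (pos_a, text_a)


ops = [(1, "Aruba"), (2, "Curaçao"), (2, "Bonaire")]
print(apply_adds([], ops))       # ['Aruba', 'Bonaire', 'Curaçao']

swapped = [ops[0], *swap_adjacent_adds(ops[1], ops[2])]
print(swapped)                   # [(1, 'Aruba'), (2, 'Bonaire'), (3, 'Curaçao')]
print(apply_adds([], swapped))   # ['Aruba', 'Bonaire', 'Curaçao']
```

With the identifier-based form (after="#a"), the same reordering would need to know which node currently follows "a", which is document information.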

Here is an example that starts from a document (no namespaces for brevity):

  <p xml:id="a">
    <frame xml:id="b">
      <text-box xml:id="c">
        <p xml:id="d">hello</p>
      </text-box>
    </frame>
  </p>

The changes are: add a paragraph after the paragraph with 'hello' and delete the paragraph with the frame.

  <do>
    <add type="paragraph" s="/1/1/1/2">world</add>
    <del type="paragraph" s="/1"/>
  </do>

Swapping the two change operations would make one redundant, leaving just

  <do>
    <del type="paragraph" s="/1"/>
  </do>

When using xml:id, it is not clear from the addresses alone how the change operations should be transformed. The original set of changes would be described in this way:

  <do>
    <add type="paragraph" identifier="e" after="#d"/>
    <del s="#a"/>
  </do>

(Presumably the @type attribute would be redundant in a <del/> operation.) It is not clear from the element identifiers that the <add/> operation would be nullified by the <del/> operation. Information about the document hierarchy would be needed to see this.

The position-based addresses are only valid at the point they occupy in the list of changes. If change operations are moved to a different position in that list, their addresses may change. The only use of the addresses is to describe the location of the change.

One could argue that an advantage of xml:id would be that you can quickly find all changes that are applied to a certain paragraph by looking for the identifier mentioned in the change. But that is actually not true. If a range of paragraphs is deleted, the identifier will not be mentioned. If a span in the paragraph is modified, it will not be seen. With positional addressing, on the other hand, one can find the list of operations on one paragraph quickly. When searching for it, one needs to go through the list of changes and update the address being looked for while moving through the list.
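
The two points above can be sketched in Python. This is only an illustration under simplifying assumptions: every step in an address is treated as a plain 1-based child index, the @type-dependent rules of MCT are ignored, an "add" inserts before the node at that address, and a "del" removes the whole subtree. The helper names parse, nullified_by_delete and operations_on are hypothetical.

```python
def parse(address):
    """Turn "/1/1/1/2" into the tuple (1, 1, 1, 2)."""
    return tuple(int(step) for step in address.strip("/").split("/"))


def nullified_by_delete(add_address, del_address):
    """True if the add targets a position strictly inside the deleted node."""
    add_path, del_path = parse(add_address), parse(del_address)
    return len(add_path) > len(del_path) and add_path[:len(del_path)] == del_path


def operations_on(target, operations):
    """List the operations touching the node that starts out at `target`.

    The tracked address is updated while walking through the change list.
    """
    path = parse(target)
    hits = []
    for kind, address in operations:
        op = parse(address)
        depth = len(op)
        if kind == "del" and path[:depth] == op:
            # The node itself or one of its ancestors is deleted.
            hits.append((kind, address))
            break
        if depth > len(path) and op[:len(path)] == path:
            # The change happens somewhere inside the tracked node.
            hits.append((kind, address))
        elif depth <= len(path) and op[:depth - 1] == path[:depth - 1]:
            # A sibling of the node (or of an ancestor) is added or removed:
            # shift the matching step of the tracked address.
            index = path[depth - 1]
            if kind == "add" and op[-1] <= index:
                index += 1
            elif kind == "del" and op[-1] < index:
                index -= 1
            path = path[:depth - 1] + (index,) + path[depth:]
    return hits


changes = [("add", "/1/1/1/2"), ("del", "/1")]
print(nullified_by_delete("/1/1/1/2", "/1"))  # True
print(operations_on("/1", changes))           # [('add', '/1/1/1/2'), ('del', '/1')]
print(operations_on("/1/1/1/1", changes))     # [('del', '/1')]
```

Note that the range deletion and the nested edit are both found from the addresses alone, which is the case where searching by identifier would miss them.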

> From: office-collab@lists.oasis-open.org
> [mailto:office-collab@lists.oasis-open.org] On Behalf Of Robin LaFontaine
> Sent: Thursday, October 18, 2012 2:05 AM
> To: office-collab@lists.oasis-open.org
> Subject: Re: [office-collab] Re: Counting in Access Paths
>
> I agree with your concerns, Dennis, and am surprised implementors do not
> seem to worry about this issue. OT has, we are told, been proved to work
> in a forwards direction when by definition all the 'edits' will be tracked
> and executed. Working backwards (from existing to previous version, i.e.
> tracked change) when not all the changes will be tracked, and not all
> accepted, is a different problem, IMHO. However, it is simple enough
> (though it will need a good bit of effort) to demonstrate whether or not
> this is a real concern - we need a specification of how it works and some
> implementations!

By 'working backwards', I assume you mean 'finding a changeset that transforms one document into another document'. I think it would be possible to find such a set of changes for MCT too. MCT normally works by recording the actual edit operations, so when changes are reconstructed from two documents they are not actual edits but inferred edits. Nevertheless, inferring a list of changes would be possible with MCT too.

> On 17/10/2012 21:38, Dennis E. Hamilton wrote:
>
> I'm having a terminology problem here.
>
> As far as I'm concerned, s="/2/10" and e="/3/18" are absolute addresses.
> Hierarchical, but absolute, not unlike in absolute URLs. It's more
> brittle than in a URL because it is based on counted position, not on
> labeled hierarchy nodes. That means insertion and deletion of siblings at
> every level requires these references to be repaired. That's scary.

The references are used only in the current change operation. They should not occur in other places.

> Does not the MCT use of these rigid absolute paths require that the
> document be serialized before the tracking information can be serialized?

This depends on the meaning of the numbers in the absolute paths. If the numbers refer to the XML nodes, then yes. If they refer to ODF concepts like paragraphs and tables, then this would not be needed, but thorough documentation of the addressing method would be needed. This is still some work and I'd like to read up on the latest document that does this. What and where is the latest version?

> And for a consumer presenting the content, is it necessary to find the
> relevant change-tracking information by some synchronization method? My
> concern is that it is impossible to detect when synchronization has been
> broken. Everything has to be absolutely perfect and there is no way to
> touch a change-tracked document without having to adjust all of the
> tracking locations. That's a considerable burden.

The change operations are only valid for one version of the document. If the document is modified, the changes do not apply any more at all. To make them apply, the changes between the document for which they apply and the document that results from further user editing would need to be inferred. So if a user sends a revision of a document without recording the changes and asks for this document to be merged with the last version for which the changes were recorded, then one could create a new version and attribute the changes.

> I am only addressing the cross-identification approach here. It appears
> that ODF CT does this in a more robust way; I don't see why MCT can't be
> made at least as resilient.
>
> I think there does need to be a stretchy way to connect between tracked
> details and the point of change. An xml:id ID value and a corresponding
> IDREF attribute value do this perfectly for XML-modeled persistent
> document formats. And this kind of support already has to exist in
> ODF-based processors simply because of the many ways that
> cross-referencing is handled by identifiers of some type, including IRIs
> that refer into package parts by fragment identifiers.

Could you explain what the cross-identification approach is?

Well, that was a long mail. I thought the meeting was today and had planned time for it, which allowed me to write so much.

Cheers,
Jos