

Subject: Re: [office-collab] GCT-Issue-2 (was GCT Issues Wiki page)


Robin LaFontaine <robin.lafontaine@deltaxml.com> wrote on 09/02/2011 
12:10:03 PM:

> 
> I am worried by your comment 
> "a document has many ODF xml representations"
> because if the same document has many representations it is 
> difficult to say when two documents are 'equal' or the same 
> document. If we do not know when they are the same we cannot say 
> what has changed.
> 

There are different ways to define equivalence:

1) Byte-level equivalence

2) Unicode level equivalence (after decoding)

3) Lexical equivalence at the XML level, accounting for ignorable 
whitespace

4) Equivalence at the level of the parsed XML model, so <foo/> and 
<foo></foo> are equivalent, order of attributes is arbitrary, etc.

5) Equivalence via a canonicalization transformation, e.g., id="foo" and 
id="bar" are equivalent so long as the graph of references is the same.

6) And then you get into the semantic level with possible ways that the 
"same" document could have more than one XML representation.  For example, 
when you write out a spreadsheet that has a single value in A1, how many 
blank rows and columns do you also write out?
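
To make the difference between levels 1, 4 and 5 concrete, here is a rough Python sketch. It is only an illustration: the attribute names `id` and `ref` and the helper names `parsed_form` and `canonical_ids` are made up for this example and are not taken from ODF or from any proposal.

```python
import xml.etree.ElementTree as ET


def parsed_form(elem):
    """Level 4: a comparable view of the parsed tree.

    Attribute order and the <foo/> vs <foo></foo> distinction disappear,
    because only the parsed tag, attributes, text and children are kept.
    """
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        elem.text or "",
        elem.tail or "",
        tuple(parsed_form(child) for child in elem),
    )


def canonical_ids(xml_text):
    """Level 5: rename ids in document order before comparing.

    Assumes (hypothetically) that identifiers live in an `id` attribute
    and references to them in a `ref` attribute.
    """
    root = ET.fromstring(xml_text)
    mapping = {}
    for elem in root.iter():
        if "id" in elem.attrib:
            mapping.setdefault(elem.attrib["id"], f"n{len(mapping)}")
    for elem in root.iter():
        for name in ("id", "ref"):
            if name in elem.attrib:
                value = elem.attrib[name]
                elem.set(name, mapping.get(value, value))
    return parsed_form(root)


a = '<doc><p id="foo"/><link ref="foo"/></doc>'
b = '<doc><p id="bar"></p><link ref="bar"></link></doc>'

same_bytes = a.encode("utf-8") == b.encode("utf-8")                      # level 1
same_parsed = parsed_form(ET.fromstring(a)) == parsed_form(ET.fromstring(b))  # level 4
same_canonical = canonical_ids(a) == canonical_ids(b)                    # level 5
```

Here `a` and `b` differ at the byte level and at the parsed level (the id values differ), but compare equal once the ids are renamed consistently, since the graph of references is the same.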


> I am making the assumption that if the XML representation of two 
> documents is different then the documents are different. Of course 
> some differences (e.g. text content) are more important than others 
> (e.g. automatic styles). If this basic assumption is wrong then 
> perhaps we need to define a canonical form. Or is there some other 
> way forward?
> 

This would be a problem in a diff application, and application of a 
canonicalization transformation could help.  But I'm not sure this is a 
problem with change tracking.  For example, if a document is edited in app 
A, and then app B loads the same document, and introduces some 
user-directed changes plus some transformations that are insignificant 
(more blank rows in the spreadsheet, for example) then app B should know 
what changes were real and which ones were artifacts and only write out 
change tracking for the changes that it wants to be tracked.
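
As a rough sketch of what I mean, assume app B holds the sheet as a list of rows of strings, with "" meaning a blank cell. That model and the names `trim_blank` and `tracked_changes` are hypothetical, chosen only for this example. The app can discard trailing blank rows and columns on both sides before deciding what to record:

```python
def trim_blank(grid):
    """Drop trailing blank cells in each row, then trailing empty rows."""
    rows = [list(row) for row in grid]
    for row in rows:
        while row and row[-1] == "":
            row.pop()
    while rows and not rows[-1]:
        rows.pop()
    return rows


def tracked_changes(before, after):
    """Return (row, col, old, new) only for cells whose value really changed."""
    a, b = trim_blank(before), trim_blank(after)
    changes = []
    for r in range(max(len(a), len(b))):
        row_a = a[r] if r < len(a) else []
        row_b = b[r] if r < len(b) else []
        for c in range(max(len(row_a), len(row_b))):
            old = row_a[c] if c < len(row_a) else ""
            new = row_b[c] if c < len(row_b) else ""
            if old != new:
                changes.append((r, c, old, new))
    return changes


as_loaded = [["1"]]
padded_only = [["1", "", ""], ["", ""], []]   # same sheet, more blank rows/columns
real_edit = [["2", ""], [""]]                 # A1 changed by the user

no_changes = tracked_changes(as_loaded, padded_only)
one_change = tracked_changes(as_loaded, real_edit)
```

With this, extra padding written out by app B produces no tracked change, while the user's edit to A1 does.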

> Robin
> 
> On 26/08/2011 19:46, Andreas J. Guelzow wrote: 
> I had obviously read that message before. Unfortunately we do not even
> seem to agree on the basic concepts. For me a document has many ODF xml
> representations and changing between those representations does not
> represent a document change, so while GCT may be well suited for
> recording changes in the representation I fail to see how it can be used
> successfully to recognize changes to the document itself. 

> > 
> 
> -- 
> -- -----------------------------------------------------------------
> Robin La Fontaine, Director, DeltaXML Ltd  "Change control for XML"
> T: +44 1684 592 144  E: robin.lafontaine@deltaxml.com 
> http://www.deltaxml.com 
> Registered in England 02528681 Reg. Office: Monsell House, WR8 0QN, UK

