

Subject: Re: [office-collab] Using RDF for Change Tracking serialization?


I do not think this has been considered. Interesting idea.

Please clarify:

- you have shown how it applies to ac:change attributes, but presumably it could also be applied to the other GCT attributes as well? Then there would just be an ID and RDF referencing this ID and containing all the CT information

- presumably also RDF could be used to represent CT Sets and Stacks?

- you gain ability to query with SPARQL but the original XML could be queried with XQuery and XPath. I do not know the relative merits of these in this situation - any comments?

- if we want to define constraints, e.g. what constitutes a valid delete column change, would this be easier with CT in RDF or as XML?

- presumably some XML infrastructure in content.xml is still needed, for example markers for deleted items and the deleted item itself somewhere else in the document

Regarding your first aside about xml:id attributes - this is a big problem and the only practical solution I have seen is the simple one that requires applications to keep the IDs where possible (cut and paste does as you say require new IDs to be generated). Applications don't want to do that but the problem of matching up changed IDs is very complex and computationally expensive, so IMHO it is best to require that they are preserved. After all the rest of the XML needs to be retained, so why not the ID values? Perhaps the RDF itself could be used to preserve them??

Robin

On 12/05/2011 02:19, monkeyiq wrote:
Hi,

  Since I know of a few office applications which support RDF, I thought
of perhaps considering leveraging it for change tracking. Apologies if
this has been thought of and dismissed in the past. As far as I know,
OpenOffice, Calligra/KOffice, and abiword all have support for at least
preserving RDF across load/save cycles.

As an aside, since I have been involved in making some of that happen,
the challenges I see when making an implementation support RDF in ODT
are the following:

(1) handling RDF/XML to internal model transition (handling the RDF
    file itself), when xml:id values change in the document 
(2) updating references from the RDF to those nodes,
(3) handling copy and paste, again due to the xml:id needing to be
    unique and when that happens references from the RDF to the pasted
    document content have to be added.
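
To make challenges (2) and (3) concrete, here is a minimal Python sketch. It assumes RDF statements are held as plain (subject, predicate, object) tuples and that document nodes are referenced as "content.xml#<xml:id>" strings. The function names, the ex:note predicate, and the id values are hypothetical and are not taken from any of the implementations mentioned above.

```python
# Hypothetical sketch: RDF statements are modelled as (subject, predicate,
# object) tuples; document nodes are referenced as "content.xml#<xml:id>".

def remap_ids(triples, id_map, prefix="content.xml#"):
    """Rewrite references to old xml:id values after ids have changed."""
    def fix(term):
        if isinstance(term, str) and term.startswith(prefix):
            old = term[len(prefix):]
            return prefix + id_map.get(old, old)
        return term
    return [(fix(s), p, fix(o)) for (s, p, o) in triples]


def copy_for_paste(triples, old_id, new_id, prefix="content.xml#"):
    """Add copies of the statements about old_id so that they also
    describe the pasted content, which carries the fresh id new_id."""
    old_ref, new_ref = prefix + old_id, prefix + new_id
    copies = [
        (new_ref if s == old_ref else s, p, new_ref if o == old_ref else o)
        for (s, p, o) in triples
        if s == old_ref or o == old_ref
    ]
    return triples + copies


# Illustrative data only.
rdf = [("content.xml#id1", "ex:note", "example annotation")]
renamed = remap_ids(rdf, {"id1": "id2"})    # xml:id changed on load/save
pasted = copy_for_paste(rdf, "id1", "id3")  # content copied and pasted
```

Both helpers return new lists rather than modifying the input, so the original statements remain available if the edit is undone.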

  Anyway back to the point. As there is some implementation support for
RDF I was considering how one might serialize change tracking into RDF
instead of inline in content.xml. I've kept things very simple in this
email, if there is interest I'm happy to expand and explore using other
constructs with RDF too.

Consider this example from the GCT (a fragment generated with abiword),
the "text" of a four word paragraph undergoes a number of style changes
which are represented using ac:changeXXX attributes;

<text:p text:style-name="Normal" 
  delta:insertion-type="insert-with-content" 
  delta:insertion-change-idref="1">This is the 
<text:span text:style-name="T1" delta:insertion-change-idref="4" 
 delta:insertion-type="insert-around-content"
 ac:change1="2,insert,text:style-name," 
 ac:change2="3,modify,text:style-name,T2" 
 ac:change3="4,modify,text:style-name,T4">text</text:span> 
here.</text:p>

Considering only the ac:change attributes, in RDF one might instead see
the change tracking coalesced into an xml:id.

<text:p text:style-name="Normal" 
  delta:insertion-type="insert-with-content" 
  delta:insertion-change-idref="1">This is the 
<text:span xml:id="f009" text:style-name="T1"
delta:insertion-change-idref="4" 
      delta:insertion-type="insert-around-content">text</text:span> 
here.</text:p>

The ac:changes are then serialized as RDF. It might also be advantageous
to split the attribute value into its constituents. Without the
formalities of namespaces and in a more abstract triple format this
might lead to something like:

bnodeA revision  2
bnodeA type      insert
bnodeA attribute text:style-name
bnodeB revision  3
bnodeB type      modify
bnodeB attribute text:style-name
bnodeB oldvalue  T2
bnodeC revision  4
bnodeC type      modify
bnodeC attribute text:style-name
bnodeC oldvalue  T4

Of course this would need to also link bnodeA,B,C subjects back to the
xml:id of f009 in the core document.
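
The splitting described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the attributes are passed in as a plain dict, the blank node labels (_:change1, ...) and the "appliesTo" predicate linking each change back to the xml:id are made up for this example, and no real namespaces are used.

```python
def change_triples(xml_id, attrs):
    """Turn ac:changeN attribute values into (subject, predicate, object)
    triples, one blank node per change."""
    prefix = "ac:change"
    # Order the changes by the numeric suffix of the attribute name.
    changes = sorted(
        (int(name[len(prefix):]), value)
        for name, value in attrs.items()
        if name.startswith(prefix)
    )
    triples = []
    for n, value in changes:
        # Value layout as in the fragment: revision,type,attribute,oldvalue
        revision, change_type, attribute, old_value = value.split(",", 3)
        bnode = f"_:change{n}"
        triples.append((bnode, "revision", int(revision)))
        triples.append((bnode, "type", change_type))
        triples.append((bnode, "attribute", attribute))
        if old_value:  # an insert has no previous value
            triples.append((bnode, "oldvalue", old_value))
        # Hypothetical predicate tying the change to the document node.
        triples.append((bnode, "appliesTo", f"content.xml#{xml_id}"))
    return triples


span_attrs = {
    "ac:change1": "2,insert,text:style-name,",
    "ac:change2": "3,modify,text:style-name,T2",
    "ac:change3": "4,modify,text:style-name,T4",
}
triples = change_triples("f009", span_attrs)
```

With these triples in a separate RDF file, the text:span in content.xml only needs to carry its xml:id.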

There are many advantages that I see to exploring RDF for this purpose:

(1) The change tracking information can have annotations and digital
    signatures applied. It would be quite simple for bnodeA to also
    include a signature for the subgraph ( bnodeA ? ? - bnodeA
    signature ? ), i.e. all RDF with a subject of bnodeA, sans any
    existing digital signature (on the off chance one exists), to
    avoid ambiguity.

(2) implementations which do not support change tracking have a
    simpler and smaller document to load.

(3) queries can be run on the change tracking information using SPARQL

(4) The RDF affords applications a scratch space to associate any
    other semantics with changes that might be desired. Since the
    association can be to other RDF it should be resilient to
    implementations which do not know of the additional custom RDF.
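
As a rough illustration of points (1) and (3), the Python sketch below selects the subgraph ( bnodeA ? ? - bnodeA signature ? ) and computes a digest over it, and runs a simple pattern query. It is a sketch under assumptions: the triples are plain tuples of strings, the pattern matcher is a stand-in for a SPARQL query and not SPARQL itself, and a SHA-256 digest is used as the input a signing step would consume. A real implementation would need a proper RDF canonicalization and an actual signature scheme.

```python
import hashlib

def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard (?)."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def subgraph_digest(triples, subject):
    """Digest all triples with the given subject, ignoring any existing
    'signature' triple so that re-signing stays unambiguous."""
    selected = sorted(t for t in match(triples, s=subject) if t[1] != "signature")
    canonical = "\n".join(" ".join(t) for t in selected)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

changes = [
    ("bnodeA", "revision", "2"),
    ("bnodeA", "type", "insert"),
    ("bnodeA", "attribute", "text:style-name"),
    ("bnodeB", "revision", "3"),
    ("bnodeB", "type", "modify"),
    ("bnodeB", "attribute", "text:style-name"),
    ("bnodeB", "oldvalue", "T2"),
    ("bnodeC", "revision", "4"),
    ("bnodeC", "type", "modify"),
    ("bnodeC", "attribute", "text:style-name"),
    ("bnodeC", "oldvalue", "T4"),
]

digest = subgraph_digest(changes, "bnodeA")
# Attaching the digest as a signature triple does not alter the digest.
signed = changes + [("bnodeA", "signature", digest)]
# Query: which changes are modifications?
modified = [t[0] for t in match(changes, p="type", o="modify")]
```

Because the signature triple is excluded from the selection, the digest can be recomputed and checked after it has been stored alongside the change.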

One major downside to this approach is that it requires an
implementation to get its hands dirty with some RDF support in order to
support change tracking. There is also the issue that an application not
supporting RDF might break the xml:id links from the RDF to the
document. That said, if the change tracking specification does not use
RDF and an application which does not support change tracking is used to
load an ODF file, it too will probably not save the change tracking
information if/when the document is saved.





-- 
-- -----------------------------------------------------------------
Robin La Fontaine, Director, DeltaXML Ltd  "Change control for XML"
T: +44 1684 592 144  E: robin.lafontaine@deltaxml.com      
http://www.deltaxml.com      
Registered in England 02528681 Reg. Office: Monsell House, WR8 0QN, UK

