office-collab message



Subject: RE: [office-collab] Change Tracking on the RDF itself: Some initial thoughts...


Related to today's call and discussion of the RDF, I wanted to clarify some things:

1. In the content.xml file, the only use of RDF is as RDFa.  That is, it is done using attributes only (and not all of the RDFa attributes are usable).  Since any element that allows RDFa can carry a particular RDFa attribute only once, <span>-type elements can be used to introduce more of them.  (Having reification via the RDFa provisions strikes me as unlikely.)

2. In the separate RDF in an ODF 1.2 package (RDF that provides semantic information about the XML parts that constitute the ODF document as such), RDF/XML is used.

3. There is, of course, nothing to prevent the introduction of RDF/XML elements as children of elements in the content.xml (and other) XML documents of the ODF.  Under the current schema and conformance targets, these are foreign elements and treated as extensions.  (It appears that this is not allowed in manifest.xml since the rules about extensions and foreign elements were not extended to that case. The top-level manifest.rdf might have some bearing on this case, but it is difficult to know for sure.)

4. The current ODF specification says nothing about whether and how the metadata carried in these RDF representations is made known to the user of an ODF consumer, or how its connection to the ODF document content is maintained.  Any provision for coordinated change tracking between the ODF document and the RDF tied to it needs to recognize that there is considerable variability here.  (The RDF is new in ODF 1.2, and it is not clear to me what level of interoperable implementation exists at this time.)

I just wanted to add some context for the discussion about how RDF might be involved as the subject of change tracking.

 - Dennis

-----Original Message-----
From: monkeyiq [mailto:monkeyiq@gmail.com] 
Sent: Sunday, May 29, 2011 04:46
To: office-collab@lists.oasis-open.org
Subject: [office-collab] Change Tracking on the RDF itself: Some initial thoughts...

Hi,
  I thought I'd throw around some ideas and runnable scripts for change tracking the RDF triples. The scripts use the redland library to execute. The design follows my past idea that the RDF model offered by the document API need not be the RDF model that is stored in the ODF file. For example, the RDF in the file might have all triples reified whereas the RDF offered by the API might not expose that if not needed.

  Many of the examples here have technical issues that I am aware of, but I put them forward anyway to stimulate discussion. There are also some simplifications, made to keep the examples focused and compact.

Consider the following little RDF using redland:



#!/bin/bash
# start from an empty store
rm -f test*db
# a location (latitude and longitude) hanging off bnode1
rdfproc test -- add bnode1 "http://www.w3.org/2003/01/geo/wgs84_pos#lat" "51.47026"
rdfproc test -- add bnode1 "http://www.w3.org/2003/01/geo/wgs84_pos#long" "-2.59466"
# a foaf entry for Gollum
rdfproc test -- add "uri:gollum" "http://xmlns.com/foaf/0.1/name" "Gollum"
rdfproc test -- add "uri:gollum" "http://xmlns.com/foaf/0.1/phone" "tel:11 1322342"

The same sort of thing might be specified with change tracking as follows. The first part breaks each wgs84 triple into three triples, one each for its subject, predicate, and object. In a formal spec one would have to follow proper RDF reification, but I've just faked it for the purpose of discussion.

After the two location triples and the three triples of a foaf entry for Gollum are added, all of those reified triples are given a change-tracking version. This might be the same value as the delta:change-id instead of just a number. In that case the RDF would have to order revisions using the dc:date instead of comparing their numeric values directly.  That is, instead of sorting on the number $VER one would have to look up the following and sort on it instead:
totimet(delta:change-transaction@[delta:change-id=$VER]/delta:change-info/dc:date)
It may be convenient for change tracked RDF to replicate a fragment of the content.xml//delta:tracked-changes tree in the manifest.rdf.
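
To make the date-based ordering concrete, here is a minimal Python sketch (plain Python, not redland). The change-ids and dates are invented for illustration, and the CHANGE_DATES table stands in for whatever lookup of delta:change-info/dc:date an implementation would really perform:

```python
from datetime import datetime

# Hypothetical change transactions: opaque delta:change-id -> dc:date.
# Both the ids and the dates are made up for this example.
CHANGE_DATES = {
    "ct-a41f": "2011-05-27T09:15:00+00:00",
    "ct-09be": "2011-05-26T18:02:00+00:00",
    "ct-77d0": "2011-05-28T11:40:00+00:00",
}


def to_time_t(change_id):
    """Return the dc:date of a change transaction as seconds since the epoch."""
    return datetime.fromisoformat(CHANGE_DATES[change_id]).timestamp()


def revision_order(change_ids):
    """Order change-ids from oldest to newest by their dc:date."""
    return sorted(change_ids, key=to_time_t)


def not_newer_than(change_id, limit_id):
    """Date-based stand-in for the numeric test 'version <= $VER'."""
    return to_time_t(change_id) <= to_time_t(limit_id)


print(revision_order(CHANGE_DATES))
```

With opaque ids like these, every "version <= N" comparison in a query would become a comparison of the looked-up dates, which is one argument for having the change transaction dates available in the RDF itself.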

Two updates to Gollum's details follow, and a deletion of his home page. The explicit forward and reverse linking of old and new RDF triples is a bit redundant. It is present because I'm playing with SPARQL to see what is the simplest way to get the "current" RDF at a revision.



#!/bin/bash

rm -f cttest*db

rdfproc cttest -- add uri:r1 "uri:subject"   "bnode1"
rdfproc cttest -- add uri:r1 "uri:predicate" "http://www.w3.org/2003/01/geo/wgs84_pos#lat"
rdfproc cttest -- add uri:r1 "uri:object"    "51.47026"

rdfproc cttest -- add uri:r2 "uri:subject"   "bnode1"
rdfproc cttest -- add uri:r2 "uri:predicate" "http://www.w3.org/2003/01/geo/wgs84_pos#long"
rdfproc cttest -- add uri:r2 "uri:object"    "-2.59466"

rdfproc cttest -- add uri:r3 "uri:subject"   "uri:gollum"
rdfproc cttest -- add uri:r3 "uri:predicate" "http://xmlns.com/foaf/0.1/name"
rdfproc cttest -- add uri:r3 "uri:object"    "Gollum"

rdfproc cttest -- add uri:r4 "uri:subject"   "uri:gollum"
rdfproc cttest -- add uri:r4 "uri:predicate" "http://xmlns.com/foaf/0.1/phone"
rdfproc cttest -- add uri:r4 "uri:object"    "tel:11 1322342"

rdfproc cttest -- add uri:r5 "uri:subject"   "uri:gollum"
rdfproc cttest -- add uri:r5 "uri:predicate" "http://xmlns.com/foaf/0.1/homepage"
rdfproc cttest -- add uri:r5 "uri:object"    "http://en.wikipedia.org/wiki/gollum"

rdfproc cttest -- add uri:r1 "uri:delta-change-id"  "1^^xsd:integer"
rdfproc cttest -- add uri:r2 "uri:delta-change-id"  "1^^xsd:integer"
rdfproc cttest -- add uri:r3 "uri:delta-change-id"  "1^^xsd:integer"
rdfproc cttest -- add uri:r4 "uri:delta-change-id"  "1^^xsd:integer"
rdfproc cttest -- add uri:r5 "uri:delta-change-id"  "1^^xsd:integer"

# update Gollum's phone number
rdfproc cttest -- add uri:r6 "uri:subject"          "uri:gollum"
rdfproc cttest -- add uri:r6 "uri:predicate"        "http://xmlns.com/foaf/0.1/phone"
rdfproc cttest -- add uri:r6 "uri:object"           "tel:11 6665534"
rdfproc cttest -- add uri:r6 "uri:delta-change-id"  "2^^xsd:integer"
rdfproc cttest -- add uri:r6 "uri:update"           "uri:r4"
rdfproc cttest -- add uri:r4 "uri:succeddedby"      "uri:r6"

# remove his home page.
rdfproc cttest -- add uri:r7 "uri:delta-change-id"   "3^^xsd:integer"
rdfproc cttest -- add uri:r7 "uri:delete"            "uri:r5"
rdfproc cttest -- add uri:r5 "uri:succeddedby"       "uri:r7"

# update Gollum's phone number again
rdfproc cttest -- add uri:r8 "uri:subject"          "uri:gollum"
rdfproc cttest -- add uri:r8 "uri:predicate"        "http://xmlns.com/foaf/0.1/phone"
rdfproc cttest -- add uri:r8 "uri:object"           "tel:11 3232 6665534"
rdfproc cttest -- add uri:r8 "uri:delta-change-id"  "4^^xsd:integer"
rdfproc cttest -- add uri:r8 "uri:update"           "uri:r6"
rdfproc cttest -- add uri:r6 "uri:succeddedby"      "uri:r8"
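
The idea that the stored RDF can be reified while the API exposes plain triples can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names reify and unreify are hypothetical, the faked uri:subject / uri:predicate / uri:object predicates are the ones from the script above, and versions are ignored when converting back:

```python
def reify(record_id, triple, change_id):
    """Split one plain triple into the faked reified statements used in the script."""
    subject, predicate, obj = triple
    return [
        (record_id, "uri:subject", subject),
        (record_id, "uri:predicate", predicate),
        (record_id, "uri:object", obj),
        (record_id, "uri:delta-change-id", change_id),
    ]


def unreify(statements):
    """Rebuild plain triples from reified statements.

    Records that lack a subject, predicate, or object (for example a pure
    delete marker) yield no triple. Change-ids are not considered here.
    """
    records = {}
    for record_id, predicate, value in statements:
        records.setdefault(record_id, {})[predicate] = value
    needed = ("uri:subject", "uri:predicate", "uri:object")
    return [
        tuple(rec[key] for key in needed)
        for rec in records.values()
        if all(key in rec for key in needed)
    ]


stored = reify(
    "uri:r3",
    ("uri:gollum", "http://xmlns.com/foaf/0.1/name", "Gollum"),
    "1",
)
print(unreify(stored))
```

An API built along these lines would hand out only the result of unreify, so callers that do not care about change tracking never see the record URIs.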

Using older SPARQL versions makes some of the queries a tad more verbose and convoluted...
http://en.wikibooks.org/wiki/XQuery/SPARQL_Tutorial#Compute_the_maximum_salary

Without yet considering how to drop triples which are deleted, I have the following SPARQL which will show the added or modified triples at revision 3 of the document. Note that the optional {} and bound() portion of the query could probably be simplified in SPARQL 1.1 implementations.

Basically the query seeks the subject, predicate, and object with a version <= a desired number. The optional clause attempts to find the triple which succeeds the current one and has a revision in the valid range. If no such succeeding triple is found (!bound(?nestedver)) then we have the latest version of a triple that is not newer than the desired revision. This is for the case where one seeks the RDF as it was in the past, i.e. the current latest version might be 10 but we want to see how things were at revision 3.



#!/bin/bash
rdfproc cttest query - - '
select ?s ?p ?o
where
{
   ?s ?p ?o .
   ?s <uri:delta-change-id> ?ver
   optional { 
             ?s      <uri:succeddedby>     ?sucver .
             ?sucver <uri:delta-change-id> ?nestedver .
             FILTER ( ?nestedver <= "3^^xsd:integer" && ?nestedver > ?ver )
         } .
   filter( !bound(?nestedver) && ?ver <= "3^^xsd:integer" ) } '

The results are as follows. Note that the query still has to be updated to respect the uri:delete predicate. Notice that r8 is not present as its change-id is too new. Also notice that r4 is *not* present as r6 replaces it in change-id=2.



rdfproc: Query returned bindings results:
result: [s=<uri:r1>, p=<uri:object>, o="51.47026"]
result: [s=<uri:r1>, p=<uri:subject>, o="bnode1"]
result: [s=<uri:r1>, p=<uri:predicate>, o=<http://www.w3.org/2003/01/geo/wgs84_pos#lat>]
result: [s=<uri:r1>, p=<uri:delta-change-id>, o="1^^xsd:integer"]
result: [s=<uri:r2>, p=<uri:object>, o="-2.59466"]
result: [s=<uri:r2>, p=<uri:subject>, o="bnode1"]
result: [s=<uri:r2>, p=<uri:predicate>, o=<http://www.w3.org/2003/01/geo/wgs84_pos#long>]
result: [s=<uri:r2>, p=<uri:delta-change-id>, o="1^^xsd:integer"]
result: [s=<uri:r3>, p=<uri:object>, o="Gollum"]
result: [s=<uri:r3>, p=<uri:subject>, o=<uri:gollum>]
result: [s=<uri:r3>, p=<uri:predicate>, o=<http://xmlns.com/foaf/0.1/name>]
result: [s=<uri:r3>, p=<uri:delta-change-id>, o="1^^xsd:integer"]
result: [s=<uri:r6>, p=<uri:object>, o="tel:11 6665534"]
result: [s=<uri:r6>, p=<uri:update>, o=<uri:r4>]
result: [s=<uri:r6>, p=<uri:subject>, o=<uri:gollum>]
result: [s=<uri:r6>, p=<uri:predicate>, o=<http://xmlns.com/foaf/0.1/phone>]
result: [s=<uri:r6>, p=<uri:succeddedby>, o=<uri:r8>]
result: [s=<uri:r6>, p=<uri:delta-change-id>, o="2^^xsd:integer"]
result: [s=<uri:r7>, p=<uri:delete>, o=<uri:r5>]
result: [s=<uri:r7>, p=<uri:delta-change-id>, o="3^^xsd:integer"]
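
The selection rule of the query can also be modelled without a triple store. The following Python sketch is an approximation under assumptions: each record is reduced to its change-id and at most one successor (as in the cttest script), and the names RECORDS and visible_at are made up for the example:

```python
# Records mirror the cttest script: id -> (change-id, successor id or None).
RECORDS = {
    "r1": (1, None),
    "r2": (1, None),
    "r3": (1, None),
    "r4": (1, "r6"),
    "r5": (1, "r7"),
    "r6": (2, "r8"),
    "r7": (3, None),
    "r8": (4, None),
}


def visible_at(records, rev):
    """Return the ids of the records the query would select at revision rev."""
    visible = []
    for record_id, (ver, successor) in records.items():
        if ver > rev:
            continue  # record is newer than the requested revision
        if successor is not None:
            nested_ver = records[successor][0]
            if ver < nested_ver <= rev:
                continue  # a successor within range replaces this record
        visible.append(record_id)
    return sorted(visible)


print(visible_at(RECORDS, 3))  # ['r1', 'r2', 'r3', 'r6', 'r7']
```

For revision 3 this selects the same records as the rdfproc output: r4 and r5 are replaced, r8 is too new, and the delete record r7 is still listed.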

My next move will be to respect deletion in the SPARQL. Note that this is part way there, as r5 is not shown above because it is succeeded by r7 which is itself a delete operation on that triple.
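
One way deletion might be respected is sketched below in Python rather than SPARQL. It is not a worked-out design: it simply assumes that a delete record carries no triple of its own, so it can be skipped, while the record it deletes is already hidden by the successor rule. The field names ("ver", "triple", "succeeded_by") and the function triples_at are hypothetical:

```python
FOAF = "http://xmlns.com/foaf/0.1/"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Same data as the cttest script; r7 is a pure delete marker for r5.
RECORDS = {
    "r1": {"ver": 1, "triple": ("bnode1", GEO + "lat", "51.47026")},
    "r2": {"ver": 1, "triple": ("bnode1", GEO + "long", "-2.59466")},
    "r3": {"ver": 1, "triple": ("uri:gollum", FOAF + "name", "Gollum")},
    "r4": {"ver": 1, "triple": ("uri:gollum", FOAF + "phone", "tel:11 1322342"),
           "succeeded_by": "r6"},
    "r5": {"ver": 1, "triple": ("uri:gollum", FOAF + "homepage",
                                "http://en.wikipedia.org/wiki/gollum"),
           "succeeded_by": "r7"},
    "r6": {"ver": 2, "triple": ("uri:gollum", FOAF + "phone", "tel:11 6665534"),
           "succeeded_by": "r8"},
    "r7": {"ver": 3, "delete": "r5"},
    "r8": {"ver": 4, "triple": ("uri:gollum", FOAF + "phone", "tel:11 3232 6665534")},
}


def triples_at(records, rev):
    """Return the plain triples that exist at revision rev, honouring deletes."""
    current = []
    for rec in records.values():
        if rec["ver"] > rev:
            continue  # not yet created at this revision
        if "triple" not in rec:
            continue  # delete marker: contributes no triple itself
        successor = rec.get("succeeded_by")
        if successor is not None and rec["ver"] < records[successor]["ver"] <= rev:
            continue  # updated or deleted by a record within range
        current.append(rec["triple"])
    return current


for triple in triples_at(RECORDS, 3):
    print(triple)
```

At revision 2 the home page triple is still returned; from revision 3 onwards it is gone, and the delete record itself never shows up in the result.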




