
office-collab message



Subject: Change Tracking on the RDF itself: Some initial thoughts...


Hi,
  I thought I'd throw around some ideas and runnable scripts for change tracking the RDF triples. The scripts need the Redland library (its rdfproc tool) to run. The design follows my earlier idea that the RDF model offered by the document API need not be the RDF model that is stored in the ODF file. For example, the RDF in the file might have all triples reified, whereas the RDF offered by the API need not expose that reification unless it is needed.

  Many of the examples here have technical issues, which I am aware of, but I put them forward anyway to stimulate discussion. There are also some simplifications, made to keep the examples focused and compact.

Consider the following little RDF using redland:

#!/bin/bash
rm -f test*db
rdfproc test -- add bnode1 "http://www.w3.org/2003/01/geo/wgs84_pos#lat" "51.47026"
rdfproc test -- add bnode1 "http://www.w3.org/2003/01/geo/wgs84_pos#long" "-2.59466"
rdfproc test -- add "uri:gollum" "http://xmlns.com/foaf/0.1/name" "Gollum"
rdfproc test -- add "uri:gollum" "http://xmlns.com/foaf/0.1/phone" "tel:11 1322342"

The same sort of thing might be specified with change tracking as follows. The first part breaks each wgs84 triple into three triples, one each for its subject, predicate, and object. A formal spec would have to follow proper RDF reification, but I've just faked it for the purpose of discussion.
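As a rough illustration of the faked reification just described, here is a small Python sketch that is independent of Redland. The function name `reify` and the `uri:rN` record naming are assumptions made for this example; the `uri:subject`, `uri:predicate` and `uri:object` predicates are the ones used in the script that follows.

```python
def reify(triples, start=1):
    """Split each plain (s, p, o) triple into three statements about a record uri:rN.

    This mirrors the faked reification used for discussion, not the formal
    rdf:Statement vocabulary.
    """
    out = []
    for n, (s, p, o) in enumerate(triples, start):
        rec = f"uri:r{n}"  # hypothetical record name, one per original triple
        out.append((rec, "uri:subject", s))
        out.append((rec, "uri:predicate", p))
        out.append((rec, "uri:object", o))
    return out


plain = [
    ("bnode1", "http://www.w3.org/2003/01/geo/wgs84_pos#lat", "51.47026"),
    ("uri:gollum", "http://xmlns.com/foaf/0.1/name", "Gollum"),
]

for statement in reify(plain):
    print(statement)
```

Each input triple yields exactly three statements sharing one record subject, which is the handle that the change-id and update/delete links later attach to.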

After the two location triples and the three triples of a foaf entry for Gollum are added, all of those reified triples are given a change-tracking version. This version might be the delta:change-id value itself rather than a plain number. In that case the RDF could no longer order revisions by their numeric values directly and would have to order them by dc:date. That is, instead of sorting on the number $VER, one would have to look up the following and sort on it instead:
totimet(delta:change-transaction@[delta:change-id=$VER]/delta:change-info/dc:date)
It may be convenient for change tracked RDF to replicate a fragment of the content.xml//delta:tracked-changes tree in the manifest.rdf.
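The date-based ordering of revisions mentioned above could look roughly like the following Python sketch. The change-ids and dates in `change_dates` are invented for illustration; in a real document they would come from the delta:change-transaction elements (or from a copy of them in manifest.rdf).

```python
from datetime import datetime

# Hypothetical mapping: delta:change-id -> dc:date of its change transaction.
change_dates = {
    "ct-a": "2011-03-02T10:15:00",
    "ct-b": "2011-03-01T09:00:00",
    "ct-c": "2011-03-05T17:30:00",
}


def change_time(change_id):
    """Look up the dc:date for a change-id and parse it into a comparable value."""
    return datetime.fromisoformat(change_dates[change_id])


def revision_order(change_ids):
    """Return the change-ids sorted from oldest to newest transaction."""
    return sorted(change_ids, key=change_time)


def not_newer(a, b):
    """True if change a happened at or before change b (replaces 'a <= b' on numbers)."""
    return change_time(a) <= change_time(b)


print(revision_order(change_dates))
```

With opaque change-ids, every `<=` comparison on versions in a query would have to be replaced by a comparison like `not_newer`, which is one argument for keeping the dates reachable from the RDF itself.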

Two updates to Gollum's details follow, along with a deletion of his home page. The explicit forward and reverse linking of old and new RDF triples is a bit redundant. It is present because I'm playing with SPARQL to see what the simplest way is to get the "current" RDF at a revision.

#!/bin/bash

rm -f cttest*db

rdfproc cttest -- add uri:r1 "uri:subject"   "bnode1"
rdfproc cttest -- add uri:r1 "uri:predicate" "http://www.w3.org/2003/01/geo/wgs84_pos#lat"
rdfproc cttest -- add uri:r1 "uri:object"    "51.47026"

rdfproc cttest -- add uri:r2 "uri:subject"   "bnode1"
rdfproc cttest -- add uri:r2 "uri:predicate" "http://www.w3.org/2003/01/geo/wgs84_pos#long"
rdfproc cttest -- add uri:r2 "uri:object"    "-2.59466"

rdfproc cttest -- add uri:r3 "uri:subject"   "uri:gollum"
rdfproc cttest -- add uri:r3 "uri:predicate" "http://xmlns.com/foaf/0.1/name"
rdfproc cttest -- add uri:r3 "uri:object"    "Gollum"

rdfproc cttest -- add uri:r4 "uri:subject"   "uri:gollum"
rdfproc cttest -- add uri:r4 "uri:predicate" "http://xmlns.com/foaf/0.1/phone"
rdfproc cttest -- add uri:r4 "uri:object"    "tel:11 1322342"

rdfproc cttest -- add uri:r5 "uri:subject"   "uri:gollum"
rdfproc cttest -- add uri:r5 "uri:predicate" "http://xmlns.com/foaf/0.1/homepage"
rdfproc cttest -- add uri:r5 "uri:object"    "http://en.wikipedia.org/wiki/gollum"

rdfproc cttest -- add uri:r1 "uri:delta-change-id"  "1^^xsd:integer"
rdfproc cttest -- add uri:r2 "uri:delta-change-id"  "1^^xsd:integer"
rdfproc cttest -- add uri:r3 "uri:delta-change-id"  "1^^xsd:integer"
rdfproc cttest -- add uri:r4 "uri:delta-change-id"  "1^^xsd:integer"
rdfproc cttest -- add uri:r5 "uri:delta-change-id"  "1^^xsd:integer"

# update Gollum's phone number
rdfproc cttest -- add uri:r6 "uri:subject"          "uri:gollum"
rdfproc cttest -- add uri:r6 "uri:predicate"        "http://xmlns.com/foaf/0.1/phone"
rdfproc cttest -- add uri:r6 "uri:object"           "tel:11 6665534"
rdfproc cttest -- add uri:r6 "uri:delta-change-id"  "2^^xsd:integer"
rdfproc cttest -- add uri:r6 "uri:update"           "uri:r4"
rdfproc cttest -- add uri:r4 "uri:succeddedby"      "uri:r6"

# remove his home page.
rdfproc cttest -- add uri:r7 "uri:delta-change-id"   "3^^xsd:integer"
rdfproc cttest -- add uri:r7 "uri:delete"            "uri:r5"
rdfproc cttest -- add uri:r5 "uri:succeddedby"       "uri:r7"

# update Gollum's phone number again
rdfproc cttest -- add uri:r8 "uri:subject"          "uri:gollum"
rdfproc cttest -- add uri:r8 "uri:predicate"        "http://xmlns.com/foaf/0.1/phone"
rdfproc cttest -- add uri:r8 "uri:object"           "tel:11 3232 6665534"
rdfproc cttest -- add uri:r8 "uri:delta-change-id"  "4^^xsd:integer"
rdfproc cttest -- add uri:r8 "uri:update"           "uri:r6"
rdfproc cttest -- add uri:r6 "uri:succeddedby"      "uri:r8"

Using older SPARQL versions makes some of the queries a tad more verbose and convoluted...
http://en.wikibooks.org/wiki/XQuery/SPARQL_Tutorial#Compute_the_maximum_salary

Without yet considering dropping triples which are deleted, I have the following SPARQL which will show the added/modified triples at revision 3 of the document. Note that the optional {} and bound() portion of the query could probably be simplified in SPARQL 1.1 implementations.

Basically, the query seeks the subject, predicate, and object with a version <= a desired number. The optional clause attempts to find the triple which succeeds the current one and has a revision in the valid range. If no such succeeding triple is found (!bound(?nestedver)), then we have the latest version of a triple that is not newer than a given ?ver. This is for the case where one seeks the RDF as it was in the past, i.e., the current latest version might be 10 but we want to see how things were at revision 3.

#!/bin/bash
rdfproc cttest query - - '
select ?s ?p ?o 
where 
{
   ?s ?p ?o .
   ?s <uri:delta-change-id> ?ver
   optional { 
             ?s      <uri:succeddedby>     ?sucver .
             ?sucver <uri:delta-change-id> ?nestedver .
             FILTER ( ?nestedver <= "3^^xsd:integer" && ?nestedver > ?ver )
         } .
   filter( !bound(?nestedver) && ?ver <= "3^^xsd:integer" )
}
'

The results are as follows. Note that the query has to be updated to respect the uri:delete predicate. Notice that r8 is not present as it has a change-id too new. Also notice that r4 is *not* present as r6 replaces it in change-id=2.

rdfproc: Query returned bindings results:
result: [s=<uri:r1>, p=<uri:object>, o="51.47026"]
result: [s=<uri:r1>, p=<uri:subject>, o="bnode1"]
result: [s=<uri:r1>, p=<uri:predicate>, o=<http://www.w3.org/2003/01/geo/wgs84_pos#lat>]
result: [s=<uri:r1>, p=<uri:delta-change-id>, o="1^^xsd:integer"]
result: [s=<uri:r2>, p=<uri:object>, o="-2.59466"]
result: [s=<uri:r2>, p=<uri:subject>, o="bnode1"]
result: [s=<uri:r2>, p=<uri:predicate>, o=<http://www.w3.org/2003/01/geo/wgs84_pos#long>]
result: [s=<uri:r2>, p=<uri:delta-change-id>, o="1^^xsd:integer"]
result: [s=<uri:r3>, p=<uri:object>, o="Gollum"]
result: [s=<uri:r3>, p=<uri:subject>, o=<uri:gollum>]
result: [s=<uri:r3>, p=<uri:predicate>, o=<http://xmlns.com/foaf/0.1/name>]
result: [s=<uri:r3>, p=<uri:delta-change-id>, o="1^^xsd:integer"]
result: [s=<uri:r6>, p=<uri:object>, o="tel:11 6665534"]
result: [s=<uri:r6>, p=<uri:update>, o=<uri:r4>]
result: [s=<uri:r6>, p=<uri:subject>, o=<uri:gollum>]
result: [s=<uri:r6>, p=<uri:predicate>, o=<http://xmlns.com/foaf/0.1/phone>]
result: [s=<uri:r6>, p=<uri:succeddedby>, o=<uri:r8>]
result: [s=<uri:r6>, p=<uri:delta-change-id>, o="2^^xsd:integer"]
result: [s=<uri:r7>, p=<uri:delete>, o=<uri:r5>]
result: [s=<uri:r7>, p=<uri:delta-change-id>, o="3^^xsd:integer"]
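To make the selection rule of the query easier to follow, here is the same logic as a plain Python sketch over the records from the cttest script. The dictionaries and the function name `visible_at` are assumptions for illustration. Note that this sketch compares change-ids as real integers, whereas the script stores them as literal strings such as "1^^xsd:integer".

```python
# change-id of each reified record, taken from the cttest script
change_id = {"r1": 1, "r2": 1, "r3": 1, "r4": 1, "r5": 1, "r6": 2, "r7": 3, "r8": 4}

# forward links (the uri:succeddedby predicate in the script)
succeeded_by = {"r4": "r6", "r5": "r7", "r6": "r8"}


def visible_at(rev):
    """Records whose change-id is <= rev and that have no successor within (ver, rev]."""
    keep = []
    for rec, ver in change_id.items():
        if ver > rev:
            continue  # record is newer than the requested revision
        nxt = succeeded_by.get(rec)
        # mirrors the OPTIONAL + !bound(?nestedver) part of the query
        superseded = nxt is not None and ver < change_id[nxt] <= rev
        if not superseded:
            keep.append(rec)
    return keep


print(visible_at(3))  # ['r1', 'r2', 'r3', 'r6', 'r7']
```

The record set for revision 3 matches the subjects in the rdfproc output: r4 and r5 drop out because their successors fall inside the range, r8 is too new, and the delete record r7 itself is still reported.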

My next move will be to respect deletion in the SPARQL. Note that this is part of the way there already: r5 is not shown above because it is succeeded by r7, which is itself a delete operation on that triple. What remains is to stop reporting the delete record r7 itself.
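One possible shape for the delete-aware step, sketched in Python rather than SPARQL, is shown below. It also converts the reified records back into plain triples, as an API that hides the reification might do. The record layout, the `GEO`/`FOAF` constants and the function name `plain_triples_at` are assumptions for this sketch, and it relies on the backward links (uri:update, uri:delete) instead of the forward ones. It further assumes that an update or delete always carries a higher change-id than the record it replaces.

```python
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
FOAF = "http://xmlns.com/foaf/0.1/"

# (record, change-id, plain triple or None for a delete marker, record it replaces)
RECORDS = [
    ("r1", 1, ("bnode1", GEO + "lat", "51.47026"), None),
    ("r2", 1, ("bnode1", GEO + "long", "-2.59466"), None),
    ("r3", 1, ("uri:gollum", FOAF + "name", "Gollum"), None),
    ("r4", 1, ("uri:gollum", FOAF + "phone", "tel:11 1322342"), None),
    ("r5", 1, ("uri:gollum", FOAF + "homepage", "http://en.wikipedia.org/wiki/gollum"), None),
    ("r6", 2, ("uri:gollum", FOAF + "phone", "tel:11 6665534"), "r4"),   # uri:update r4
    ("r7", 3, None, "r5"),                                               # uri:delete r5
    ("r8", 4, ("uri:gollum", FOAF + "phone", "tel:11 3232 6665534"), "r6"),  # uri:update r6
]


def plain_triples_at(rev):
    """Plain (s, p, o) triples as the document looked at revision rev."""
    live = [r for r in RECORDS if r[1] <= rev]
    # anything updated or deleted by a record within range is no longer current
    replaced = {r[3] for r in live if r[3] is not None}
    # drop replaced records and the delete markers themselves
    return [r[2] for r in live if r[0] not in replaced and r[2] is not None]


for triple in plain_triples_at(3):
    print(triple)
```

At revision 3 this yields the two location triples, Gollum's name, and the phone number from r6; the home page is gone and no delete record leaks into the result.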


