Subject: RDF metadata as an extension mechanism
Hi all,

in this mail i'd like to share some thoughts on using RDF metadata as an extension mechanism.

First, i would like to address an example that Doug Mahugh has posted:

Doug Mahugh wrote:
<quote>
It's worth noting that the ODF metadata mechanisms don't allow for the use of a private/custom schema to tag content within a document. And that use case has value to many users. So if we decide that ODF won't be able to support those types of scenarios, for whatever reason, we should not be surprised to find that users who need such capabilities will look elsewhere.

Consider the trivial example of a pre-existing document, created years ago, which needs to be logged in to a content management system that requires an abstract to be identified for each document. If the format of the document is HTML, then a div with class="abstract" can be used to tag the appropriate paragraph(s) as the abstract. If the format of the document is DOCX, a customXml element with element="abstract" can be used for the same purposes.

In both cases the document content remains valid HTML or WordprocessingML, while the user adds the custom semantics required for their purpose. The custom semantics can be (and should be) ignored by others. The user is free to innovate quickly, and does not have to think in terms of a tradeoff between strict compliance and flexibility/business value. They can, and do, have the best of both worlds in such scenarios: strict compliance to a standard, and freedom to innovate quickly for their own specialized purposes.
</quote>

It seems quite simple to implement this with the current RDF metadata support, as specified by ODF 1.2. Basically, we need to identify the paragraph in content.xml that contains the abstract, and annotate it with an RDF property that expresses its, umm, "abstract-ness".

We assume that our hypothetical CMS wants to keep all of the metadata that it is interested in in a separate RDF graph; at least, that is what i would recommend.
This graph, stored in the stream "mymetafile.rdf" in the ODF package, contains the RDF statement which annotates the paragraph. In order for the CMS to find its RDF graph, we list it in the "manifest.rdf", and declare it to be of a user-defined type that the CMS understands.

Spelling things out explicitly, this gives:

Doug's example, implemented with RDF metadata:
(RDF examples are in N3 syntax, because RDF/XML is ... unintuitive; prefix declarations are omitted)

in manifest.rdf:

    <mymetafile.rdf> rdf:type pkg:MetadataFile .      # the file contains metadata
    <mymetafile.rdf> rdf:type myns:CMSAnnotations .   # the file is of interest to my CMS

in content.xml:

    <text:p xml:id="id42">In this treatise we discuss the fooness of bars.</text:p>

in mymetafile.rdf:

    <content.xml#id42> myns:isAbstract true .         # identify the abstract

Now, something like this SPARQL query gets the URI of the element containing the abstract from the RDF graphs:

    SELECT ?node
    WHERE {
      GRAPH <...baseURI.../manifest.rdf> { ?g rdf:type myns:CMSAnnotations }
      GRAPH ?g { ?node myns:isAbstract true }
    }

Imho, given that ODF already has a quite powerful mechanism for extension (RDF), any other proposed extension mechanisms should be closely scrutinized as to whether they actually add some expressive power, or merely add additional complexity by introducing different ways of doing the same thing. Or, even if they add some additional expressive power, whether that _addition_ is really worth the added complexity.

Having just read a couple of the weblog articles that Doug has posted here in another mail about the customXml feature of OOXML, i do not see anything that would be obviously impossible to do with RDF metadata. The main difference between the two approaches is the data model: with customXml, the metadata is an XML tree, while with RDF, it is an RDF graph. Granted, some data is expressed more easily with a flexible graph, and other data more easily with a strict hierarchy, but i don't think that would be a deal-breaker either way.
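To make the two-step lookup in the query above concrete, here is a minimal Python sketch that uses only the standard library. It is not an RDF implementation: the graphs are modelled as plain sets of (subject, predicate, object) tuples, the function name find_abstract_nodes is made up for illustration, and the boolean literal is represented by Python's True. A real CMS would use an actual RDF store and SPARQL engine.

```python
# Hypothetical in-memory stand-in for the named graphs in an ODF package.
# Each graph is a set of (subject, predicate, object) tuples; names are kept
# as prefixed strings exactly as in the mail, without URI resolution.
RDF_TYPE = "rdf:type"

graphs = {
    "manifest.rdf": {
        ("mymetafile.rdf", RDF_TYPE, "pkg:MetadataFile"),
        ("mymetafile.rdf", RDF_TYPE, "myns:CMSAnnotations"),
        ("other.rdf", RDF_TYPE, "pkg:MetadataFile"),
    },
    "mymetafile.rdf": {
        ("content.xml#id42", "myns:isAbstract", True),
    },
    # Listed in the manifest, but not typed as myns:CMSAnnotations,
    # so the lookup must ignore it.
    "other.rdf": {
        ("content.xml#id7", "myns:isAbstract", True),
    },
}


def find_abstract_nodes(graphs, manifest="manifest.rdf"):
    """Mimic the two GRAPH patterns of the query: first find the graphs that
    the manifest types as myns:CMSAnnotations, then collect the nodes those
    graphs mark as the abstract."""
    cms_graphs = sorted(
        s for (s, p, o) in graphs[manifest]
        if p == RDF_TYPE and o == "myns:CMSAnnotations"
    )
    nodes = []
    for name in cms_graphs:
        nodes.extend(sorted(
            s for (s, p, o) in graphs.get(name, set())
            if p == "myns:isAbstract" and o is True
        ))
    return nodes


if __name__ == "__main__":
    for node in find_abstract_nodes(graphs):
        print(node)
```

The point of going through the manifest first is that the CMS never has to guess file names; it only needs to know the type it registered for its own annotation graphs.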
One interesting feature of customXml is the 2-way data binding, with the data to be bound specified by an XPath expression. i assume this binding mechanism is standardized, yes?

In ODF we currently have text:meta-field, which is a field whose contents are given by RDF metadata, but we do not specify in any way how the content of such a field is generated from the metadata, or even from which _particular_ metadata, except for the prefix/suffix properties. (i do not know why that is so, because most of the RDF metadata stuff in ODF was designed before i got involved.) Maybe we could use a SPARQL query for bindings...

So, it seems to me that customXml and RDF metadata aim to solve (with different mechanisms) problem areas that have significant overlap, so much that i would have serious doubts about having both in the same document format. (But of course, i am not sufficiently knowledgeable about customXml to be certain about this.)

Furthermore: RDF metadata allow us to specify not only data that describes data (i.e. metadata), but we can iterate this another time to get data that describes data that describes data.

Allow me to illustrate a potential solution to the (imho very serious) problem raised by Rob, namely, how can an application tell whether it is possible to copy the non-standard properties that are attached to a standard ODF element, and whether these non-standard properties may be invalidated by edits to the element's content.

The solution is to define a couple of standard properties that can be used to describe user-defined properties. These descriptions (or meta-properties) must be put into an RDF/XML file that is referenced from the ODF document's manifest.rdf.

    my:property copyable [boolean]

Assume an ODF processor copies an element for which a statement of the form

    <content.xml#id42> my:property "foo"

exists in some RDF graph.
If copyable is true, then copying that element will cause the inserted element to have a unique xml:id attribute, say id24, and the following statement to be inserted in the same RDF graph as the other statement:

    <content.xml#id24> my:property "foo"

    my:property isDigest [boolean]

Assume an ODF processor modifies the content of an element for which a statement of the form

    <content.xml#id42> my:property "foo"

exists in some RDF graph. If isDigest is true, then modifying that element will cause the statement to be removed (assuming that the semantics of my:property is not understood by the processor in question, of course).

I am not aware of a way to do the equivalent with arbitrary XML elements and attributes. Thus, i would claim that using RDF metadata as an extension mechanism has the potential for improving interoperability.

Of course, these meta-properties i just made up here are currently not standardized. But the ODF TC has the power to do that, right?

regards,
michael
(not the one you are used to, we've got more than one here :) )

--
Michael Stahl   mailto:michael.stahl@sun.com
http://www.sun.de
OpenOffice.org/StarOffice Writer
Sun Microsystems GmbH
Nagelsweg 55, 20097 Hamburg, Germany
-----------------------------------------------------------------------
Registered office: Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
District court (Amtsgericht) Muenchen: HRB 161028
Managing directors: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
Chairman of the supervisory board: Martin Haering
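Appendix: a rough Python sketch of how a processor might honor the copyable and isDigest meta-properties proposed in this mail. Everything here is hypothetical: the meta-properties are not standardized, the function names copy_element and edit_element are invented for illustration, graphs are plain sets of (subject, predicate, object) tuples, and treating a property without any description as neither copyable nor a digest is an assumption of this sketch rather than something the proposal settles.

```python
# Hypothetical descriptions of user-defined properties (the "meta-properties").
# In the proposal these would live in an RDF/XML file referenced from
# manifest.rdf; here they are a plain dict for illustration.
meta = {
    "my:property": {"copyable": True, "isDigest": True},
}


def copy_element(graph, meta, old_id, new_id):
    """Return a graph in which every statement about old_id whose property
    is described as copyable is duplicated for new_id.
    Assumption: properties without a description are not copied."""
    added = {
        (new_id, p, o)
        for (s, p, o) in graph
        if s == old_id and meta.get(p, {}).get("copyable", False)
    }
    return graph | added


def edit_element(graph, meta, elem_id):
    """Return a graph without the statements about elem_id whose property
    is described as a digest of the element's content.
    Assumption: properties without a description are left alone."""
    return {
        (s, p, o)
        for (s, p, o) in graph
        if not (s == elem_id and meta.get(p, {}).get("isDigest", False))
    }


if __name__ == "__main__":
    graph = {("content.xml#id42", "my:property", "foo")}
    # Copying id42 yields a new element id24 that carries the same statement.
    graph = copy_element(graph, meta, "content.xml#id42", "content.xml#id24")
    # Editing the content of id42 invalidates its digest-like statement.
    graph = edit_element(graph, meta, "content.xml#id42")
    print(sorted(graph))
```

The processor in this sketch never needs to understand what my:property means; it only consults the descriptions, which is the interoperability argument made above.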