OASIS Mailing List Archives

 



office-comment message



Subject: Multiple RDF files and the Single XML document


Dear ODF committee,

In my last comment I demonstrated that RDFa is unnecessary when you can store the metadata in RDF/XML files. Today I want to challenge some of the other paradigms.

Multiple RDF files

Let me first say that the assumption that ODF needs multiple RDF files in the container seems to be unexamined. I myself can’t find any benefits that outweigh the added complexity to tools.

There is the conflicting information argument: that you want to discriminate between different RDF files in order to filter out conflicting information. First, a Semantic Web search engine will not know which RDF files in the container are good and which are bad; it will load all of them or none. Second, I doubt this is an issue in real life, because it is in the author’s interest not to create conflicting information. But if the word processor hides RDF triples from you because they live in an RDF file created by another plugin, what hope is there of discovering and resolving potentially conflicting information?

Then there is the loading time issue. It seems to me that opening and parsing one file must be faster than opening and parsing several, and these are unlikely to be large files.

My preference would be to allow only a single meta.rdf file for reasons described below. An examination of the pros and cons could prove me wrong.

Single XML document

In ODF there has always been an emphasis on being able to create a single OpenDocument XML file. The four XML files (content.xml, styles.xml, meta.xml and settings.xml) can be combined into one XML file, and there are no references to e.g. content.xml except in META-INF/manifest.xml. There is even a mechanism to embed images as base64 content.

My concern is that the RDF metadata is not part of the single OpenDocument XML file. Is this a nail in the coffin for the single XML file?

For example, say I want to add the <dc:publisher> metadata item to the document. Since the element is not allowed in meta.xml, I put it into meta.rdf. If the document is then converted to a single XML file, this important information is lost.

When an ODF consumer today loads the four XML files, it also merges the identifiers and does not keep track of the file they came from.

  1. If you use <text:user-field-get text:name="version"> in the page footer (which is stored in styles.xml) and you use the same user field in the body text (stored in content.xml), you refer to the same name – not content.xml#version or styles.xml#version.

  2. The same applies to <text:user-defined>. You write text:name="issn", not text:name="meta.xml#issn".

My preference is that the same should apply for RDF.

As a consequence, it would not be possible to provide metadata for the four XML files, as they don’t exist when the document is stored as a single XML file. The current RDF proposal operates with a manifest.rdf file. It has information about the XML files, both as references from pkg:hasPart predicates and as RDF objects (with corresponding rdf:types). These triples should be removed. The XML files are package artefacts – not parts of the document.
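A minimal sketch of that filtering step, assuming the triples have already been parsed into plain (subject, predicate, object) tuples. The prefixed names are left unexpanded, the type name ex:ContentFile and the publisher value are hypothetical placeholders, and a real implementation would distinguish resources from literals rather than compare strings:

```python
# The four package XML files named in this message.
PACKAGE_FILES = {"content.xml", "styles.xml", "meta.xml", "settings.xml"}


def drop_package_triples(triples):
    """Keep only triples that do not mention a package XML file."""
    return [
        (subject, predicate, obj)
        for (subject, predicate, obj) in triples
        if subject not in PACKAGE_FILES and obj not in PACKAGE_FILES
    ]


# Illustrative input: "" stands for the document itself (rdf:about="").
triples = [
    ("", "pkg:hasPart", "content.xml"),
    ("content.xml", "rdf:type", "ex:ContentFile"),  # hypothetical type name
    ("", "dc:publisher", "Example Press"),          # hypothetical value
    ("Pictures/11299100188.png", "dc:creator", "Wikimedia"),
]

kept = drop_package_triples(triples)
for triple in kept:
    print(triple)
```

Only the statements about the document and the image survive, which is the content that would remain in meta.rdf.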

What is left in manifest.rdf? Basically metadata about the document, images and RDF objects (such as citations) used in the document. This is the reason I prefer the name meta.rdf over manifest.rdf.

The meta.rdf is conceptually a section of the single XML document representation. All metadata elements with rdf:about="" become metadata for the document. I don’t have a complete solution for how to incorporate the meta.rdf file into the single XML file, but I could imagine something like this:

<office:document office:version="1.2" office:mimetype=...>
<office:meta>...</office:meta>
<office:meta-rdf>...</office:meta-rdf>
<office:settings>...</office:settings>
...
</office:document>

Since this element is not used when the document is stored in the zip-package, it has no effect on ODF 1.1 applications that only work with zip-packages.
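As a rough illustration of how a converter might do this, the following Python sketch wraps the parsed meta.rdf in an <office:meta-rdf> element and places it right after <office:meta>. The element name is only the suggestion made above and is not part of any published ODF schema; the function name and the sample publisher value are likewise made up for the example.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def embed_meta_rdf(document_xml, meta_rdf_xml):
    """Return the document root with meta.rdf wrapped in <office:meta-rdf>."""
    root = ET.fromstring(document_xml)
    # <office:meta-rdf> is hypothetical: it is the element imagined above.
    wrapper = ET.Element(f"{{{OFFICE_NS}}}meta-rdf")
    wrapper.append(ET.fromstring(meta_rdf_xml))
    # Insert directly after <office:meta> if present, otherwise first.
    position = 0
    for index, child in enumerate(list(root)):
        if child.tag == f"{{{OFFICE_NS}}}meta":
            position = index + 1
            break
    root.insert(position, wrapper)
    return root


# Illustrative inputs only; a real document carries far more content.
DOCUMENT = (
    f'<office:document xmlns:office="{OFFICE_NS}" office:version="1.2">'
    "<office:meta/><office:settings/>"
    "</office:document>"
)
META_RDF = (
    f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<rdf:Description rdf:about="">'
    "<dc:publisher>Example Press</dc:publisher>"
    "</rdf:Description></rdf:RDF>"
)

merged = embed_meta_rdf(DOCUMENT, META_RDF)
print(ET.tostring(merged, encoding="unicode"))
```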

There is however a special case that has to be dealt with when converting a zip-package to the single XML document: images and OLE objects. If an image has metadata attached, it will look like this in the meta.rdf file:

<rdf:Description rdf:about="Pictures/11299100188.png">
<dc:rights>Creative Commons BY-SA</dc:rights>
<dc:creator>Wikimedia</dc:creator>
</rdf:Description>

When the application converts to the single XML document, it will embed the images in <office:binary-data> elements. To keep the association with the metadata, the rdf:about URL has to be rewritten, as the image is no longer at its original location. Either the <draw:image> or the <office:binary-data> element needs an xml:id or text:name attribute so that the item can be identified from meta.rdf. The RDF would then be rewritten as:

<rdf:Description rdf:about="#image8">
<dc:rights>Creative Commons BY-SA</dc:rights>
<dc:creator>Wikimedia</dc:creator>
</rdf:Description>

That’s not all. If there are references to the metadata of images, these have to be changed as well:

<text:meta-get rdf:resource="Pictures/11299100188.png"
rdf:property="...">Wikimedia</text:meta-get>

changes to:

<text:meta-get rdf:resource="#image8"
rdf:property="...">Wikimedia</text:meta-get>
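Both rewrites (rdf:about in meta.rdf and rdf:resource in the body text) could be driven by one mapping from package path to element ID. The Python sketch below is only an assumption about how a converter might do it: the helper name retarget and the wrapping <text:p> are invented for the example, and rdf:property is left as "..." exactly as it is elided above.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def retarget(root, attribute, path_to_id):
    """Rewrite package paths in rdf:<attribute> to '#id' fragments.

    Returns the number of attributes that were changed.
    """
    qualified = f"{{{RDF_NS}}}{attribute}"
    changed = 0
    for element in root.iter():
        value = element.get(qualified)
        if value in path_to_id:
            element.set(qualified, "#" + path_to_id[value])
            changed += 1
    return changed


# Mapping built while embedding images; "image8" is the ID used above.
path_to_id = {"Pictures/11299100188.png": "image8"}

meta_rdf = ET.fromstring(
    f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">'
    '<rdf:Description rdf:about="Pictures/11299100188.png">'
    "<dc:creator>Wikimedia</dc:creator>"
    "</rdf:Description></rdf:RDF>"
)
body = ET.fromstring(
    f'<text:p xmlns:text="{TEXT_NS}" xmlns:rdf="{RDF_NS}">'
    '<text:meta-get rdf:resource="Pictures/11299100188.png" rdf:property="...">'
    "Wikimedia</text:meta-get></text:p>"
)

about_count = retarget(meta_rdf, "about", path_to_id)
resource_count = retarget(body, "resource", path_to_id)
print(ET.tostring(meta_rdf, encoding="unicode"))
print(ET.tostring(body, encoding="unicode"))
```

Keeping the mapping in one place means the metadata and every reference to it are updated consistently; a reference that is missed would silently point at a location that no longer exists.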


Best regards


Søren Roug


