Dear ODF committee,
In my last comment I demonstrated that RDFa is unnecessary when
you can store the metadata in RDF/XML files. Today I want to
challenge some of the other paradigms.
Multiple RDF files
Let me first say that the assumption that ODF needs multiple RDF
files in the container seems to be unexamined. I cannot find any
benefit that outweighs the added complexity for tools.
There is the conflicting-information argument: that you want to
discriminate between different RDF files in order to filter out
conflicting information. First, a Semantic Web search engine will not
know which RDF files in the container are good and which ones are bad.
It will load them all or none. Second, I doubt this is an issue in
real life, because it is in the author’s interest not to create
conflicting information. But if the word processor hides RDF triples
from you because they live in an RDF file created by another plugin,
then what hope is there of discovering and resolving potentially
conflicting information?
Then there is the loading-time issue. It seems to me that opening and
parsing one file must be faster than opening and parsing multiple
files. These are unlikely to be large files.
My preference would be to allow only a
single meta.rdf file for reasons described below. An examination of
the pros and cons could prove me wrong.
Single XML document
In ODF there has always been an emphasis on being able to create a
single OpenDocument XML file. The four XML files (content.xml,
styles.xml, meta.xml and settings.xml) can be combined into one XML
file, and there are no references to, for example, content.xml except
in META-INF/manifest.xml. There is even a mechanism to embed images
as base64 content.
My concern is that the RDF metadata is not part of the single
OpenDocument XML file. Is this a nail in the coffin for the single
XML file?
For example, suppose I want to add the <dc:publisher> metadata item
to the document. Since the element is not allowed in meta.xml, I put
it into meta.rdf. If the document is then converted to a single XML
file, this important information is lost.
When an ODF consumer today loads the four XML files it also merges
the identifiers and does not keep track of the file they came from.
- If you use <text:user-field-get text:name="version"> in the page
  footer (which is stored in styles.xml) and you use the same user
  field in the body text (stored in content.xml), you refer to the
  same name, not content.xml#version or styles.xml#version.
- The same applies to <text:user-defined>. You write
  text:name="issn", not text:name="meta.xml#issn".
My preference is that the same should apply for RDF.
As a consequence, it would not be possible to provide metadata for
the four XML files, as they don’t exist when the document is stored
as a single XML file. The current RDF proposal operates with a
manifest.rdf file. It has information about the XML files, both as
references from pkg:hasPart predicates and as RDF objects (with
corresponding rdf:types). These triples should be removed. The XML
files are package artefacts, not parts of the document.
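The removal described above could be sketched roughly as follows. This is only an illustration in Python using the standard xml.etree.ElementTree module. The pkg namespace URI is an assumption (it is the one published with ODF 1.2; the draft under discussion may use a different one), and the function name and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
# Assumption: pkg namespace as published with ODF 1.2; a draft may differ.
PKG = "http://docs.oasis-open.org/ns/office/1.2/meta/pkg#"
ET.register_namespace("rdf", RDF)
ET.register_namespace("pkg", PKG)
ET.register_namespace("dc", DC)

# The four XML files that are package artefacts, not parts of the document.
PACKAGE_FILES = {"content.xml", "styles.xml", "meta.xml", "settings.xml"}


def strip_package_triples(rdf_xml: str) -> str:
    """Remove triples that describe or point at the four package XML files."""
    root = ET.fromstring(rdf_xml)
    for desc in list(root.findall(f"{{{RDF}}}Description")):
        # Drop descriptions whose subject is one of the package files.
        if desc.get(f"{{{RDF}}}about") in PACKAGE_FILES:
            root.remove(desc)
            continue
        # Drop pkg:hasPart statements that reference a package file.
        for part in list(desc.findall(f"{{{PKG}}}hasPart")):
            if part.get(f"{{{RDF}}}resource") in PACKAGE_FILES:
                desc.remove(part)
        # Drop descriptions that no longer hold any statements.
        if len(desc) == 0:
            root.remove(desc)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data for illustration only.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:pkg="http://docs.oasis-open.org/ns/office/1.2/meta/pkg#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <pkg:hasPart rdf:resource="content.xml"/>
    <pkg:hasPart rdf:resource="styles.xml"/>
    <dc:publisher>Example Publisher</dc:publisher>
  </rdf:Description>
  <rdf:Description rdf:about="content.xml">
    <rdf:type rdf:resource="http://docs.oasis-open.org/ns/office/1.2/meta/odf#ContentFile"/>
  </rdf:Description>
</rdf:RDF>"""

print(strip_package_triples(SAMPLE))
```

In this sketch the document-level statement (dc:publisher on rdf:about="") survives, while everything that only exists because of the zip layout is dropped.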
What is left in manifest.rdf? Basically metadata about the
document, images and RDF objects (such as citations) used in the
document. This is the reason I prefer the name meta.rdf over
manifest.rdf.
The meta.rdf is conceptually a section of the single XML document
representation. All metadata elements with rdf:about="" become
metadata for the document. I don’t have a complete solution for how
to incorporate the meta.rdf file into the single XML file, but I
could imagine something like this:
<office:document office:version="1.2" office:mimetype=...>
  <office:meta>...</office:meta>
  <office:meta-rdf>...</office:meta-rdf>
  <office:settings>...</office:settings>
  ...
</office:document>
Since the <office:meta-rdf> element is not used when the document is
stored in the zip-package, it has no effect on ODF 1.1 applications
that only work with zip-packages.
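To make the idea concrete, here is a rough Python sketch of how a converter might place the contents of meta.rdf into the single XML document. It is not a worked-out proposal: office:meta-rdf is only the hypothetical element imagined above, wrapping the whole rdf:RDF element inside it is one possible choice among several, and the function name and sample data are invented for illustration.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("office", OFFICE)
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def embed_meta_rdf(flat_doc_xml: str, meta_rdf_xml: str) -> str:
    """Insert meta.rdf into a single-file document, right after office:meta."""
    doc = ET.fromstring(flat_doc_xml)
    rdf_root = ET.fromstring(meta_rdf_xml)

    # Hypothetical wrapper element; it is not defined by any ODF version.
    wrapper = ET.Element(f"{{{OFFICE}}}meta-rdf")
    wrapper.append(rdf_root)

    # Place the wrapper after office:meta, or first if there is no office:meta.
    meta = doc.find(f"{{{OFFICE}}}meta")
    index = list(doc).index(meta) + 1 if meta is not None else 0
    doc.insert(index, wrapper)
    return ET.tostring(doc, encoding="unicode")


# Hypothetical sample data for illustration only.
FLAT_DOC = """<office:document
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    office:version="1.2">
  <office:meta/>
  <office:settings/>
</office:document>"""

META_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:publisher>Example Publisher</dc:publisher>
  </rdf:Description>
</rdf:RDF>"""

print(embed_meta_rdf(FLAT_DOC, META_RDF))
```

With something along these lines, the dc:publisher statement travels with the document instead of being lost in the conversion.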
There is, however, a special case that has to be dealt with when
converting a zip-package to the single XML document: images and OLE
objects. If an image has metadata attached, it will look like this in
the meta.rdf file:
<rdf:Description rdf:about="Pictures/11299100188.png">
  <dc:rights>Creative Commons BY-SA</dc:rights>
  <dc:creator>Wikimedia</dc:creator>
</rdf:Description>
When the application converts to the single XML document, it will
embed the images in <office:binary-data> elements. To keep the
association with the metadata, the rdf:about URL has to be rewritten,
because the image is no longer at that location. Either the
<draw:image> or the <office:binary-data> element needs an xml:id or
text:name attribute so that the item can be identified from the
meta.rdf. The RDF would then be rewritten as:
<rdf:Description rdf:about="#image8">
  <dc:rights>Creative Commons BY-SA</dc:rights>
  <dc:creator>Wikimedia</dc:creator>
</rdf:Description>
That’s not all. If there are references to the metadata of images,
these have to be changed as well:
<text:meta-get rdf:resource="Pictures/11299100188.png"
  rdf:property="...">Wikimedia</text:meta-get>
changes to:
<text:meta-get rdf:resource="#image8"
  rdf:property="...">Wikimedia</text:meta-get>
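Both rewrites follow the same pattern, so a converter could handle them in one pass. The Python sketch below is only an illustration under stated assumptions: the mapping from package paths to identifiers (such as image8) is assumed to be produced by the converter when it embeds the images, and the function name and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)
ET.register_namespace("text", TEXT)

# Attributes that may hold a package-relative path to an image.
REF_ATTRS = (f"{{{RDF}}}about", f"{{{RDF}}}resource")


def rewrite_references(xml_text: str, path_to_id: dict) -> str:
    """Replace package paths in rdf:about / rdf:resource with #fragment ids."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():  # iter() also visits the root element itself
        for attr in REF_ATTRS:
            value = elem.get(attr)
            if value in path_to_id:
                elem.set(attr, "#" + path_to_id[value])
    return ET.tostring(root, encoding="unicode")


# Assumption: the converter records which id each embedded image received.
PATH_TO_ID = {"Pictures/11299100188.png": "image8"}

# Hypothetical sample data for illustration only.
META_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="Pictures/11299100188.png">
    <dc:rights>Creative Commons BY-SA</dc:rights>
    <dc:creator>Wikimedia</dc:creator>
  </rdf:Description>
</rdf:RDF>"""

CONTENT_FRAGMENT = """<text:meta-get
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    rdf:resource="Pictures/11299100188.png">Wikimedia</text:meta-get>"""

print(rewrite_references(META_RDF, PATH_TO_ID))
print(rewrite_references(CONTENT_FRAGMENT, PATH_TO_ID))
```

The same mapping would have to be applied in reverse when a single XML document is converted back into a zip-package.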
Best regards
Søren Roug