
Subject: metadata (OpenDocument TC Meeting Minutes ...)

From: Bruce D'Arcus <bruce.darcus@OpenDocument.us>
To: office@lists.oasis-open.org
Date: Wed, 11 Jan 2006 13:23:04 -0500

Hi All,

I wasn't able to attend this meeting (the administrative details 
weren't yet resolved), but wanted to comment on this, since it's my 
primary reason for joining the TC:

On Jan 11, 2006, at 12:26 PM, Lars Oppermann wrote:

> Meta Data:

[...]

> This mapping should also conform to the subset of the RDF data model 
> that XMP is using
> The mapping should map some of the meta:-namespace properties to dcq 
> properties if there are appropriate ones available.

These two are not consistent, as XMP notably does not use DCQ. For 
example, xmp:CreateDate vs. dcq:created, xmp:ModifyDate vs. 
dcq:modified, etc.

Also, as I say in a couple of places below, I'm not convinced XMP should 
be the framework here. It's enough at this stage to just say that we're 
dealing with the property mapping (a different issue than the model or 
syntax).
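
To make the property-mapping point concrete, here is a rough sketch in 
Python. The pairings are my own assumptions for discussion, not 
anything the TC has agreed: meta:creation-date and dc:date are the 
existing ODF creation and modification dates, the DCQ targets are the 
ones named above, and the XMP names are those of the XMP basic schema. 
The names CANDIDATE_MAPPINGS, map_property, and conflicts are 
hypothetical.

```python
# Illustrative only: candidate targets for two ODF meta properties.
# The pairings are assumptions for discussion, not a TC decision.
CANDIDATE_MAPPINGS = {
    # ODF property: {vocabulary: candidate target property}
    "meta:creation-date": {"dcq": "dcq:created", "xmp": "xmp:CreateDate"},
    "dc:date": {"dcq": "dcq:modified", "xmp": "xmp:ModifyDate"},
}


def map_property(odf_name, vocabulary):
    """Return the candidate target property, or None if none is listed."""
    return CANDIDATE_MAPPINGS.get(odf_name, {}).get(vocabulary)


def conflicts():
    """List ODF properties whose DCQ and XMP targets use different local names."""
    result = []
    for odf_name, targets in CANDIDATE_MAPPINGS.items():
        local_names = {target.split(":", 1)[1] for target in targets.values()}
        if len(local_names) > 1:
            result.append(odf_name)
    return result
```

Both entries show up in conflicts(), which is the inconsistency in a 
nutshell: a single mapping cannot follow DCQ naming and XMP naming at 
the same time.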

> The second phase will address the extensibility of the Open Document 
> meta data facility.
> Michael will collect requirements for this second phase.
> Phase 2 should enable application developers, to store those meta data 
> attributes that are needed by their users.

I personally think phases 2 and 3 ought to be reversed in order, as the 
technical details of phase 2 are more complex (and contentious).

Also, I think the TC ought to take a broad view of what "application 
developers" means in this context. I would very much like enhanced 
metadata support, for example, to open up opportunities for third-party 
developers.

> The first two phases do not cover meta data statements on specific 
> parts of a document (e.g. individual paragraphs). This will be covered 
> in a third phase.
> It is not yet clear to which kinds of objects, metadata should be 
> attachable. One might think of XML elements or specific objects, that 
> have meaning in the context of a specific document, e.g. paragraphs in 
> text documents. Also for this phase, requirements are still needed.

I'd say paragraph-level (or below) metadata for content might well be a 
phase 4 issue.

When I refer to parts, I am referring to objects embedded in documents; 
images or tables or figures or citations inserted in a text document, 
for example. This is basically a level above paragraph-level content.

If I create a diagram or figure in one application, and embed metadata 
in that file, and then add that to an ODF text document, that metadata 
should travel with it. It cannot now, because ODF only has a notion of 
document-level metadata.

When we create documents, they are not simple standalone silos. They 
bring together content from different places. That content all has 
metadata (whether implicit or explicit) associated with it: where it 
came from, who created it, what kind of use restrictions are attached 
to it in some cases, etc. In a lot of cases, this metadata is important.

With OpenDocument, we have a unique opportunity to put the 
infrastructure in place to formalize that in a consistent, but 
flexible, way.

My proposal is pretty simple:

1)  Objects that can be described by enhanced metadata: documents, 
images, tables, bibliographic references, figures (I may be missing a 
few more?).

2)  Where to store the metadata descriptions: outside of content.xml in 
the file wrapper; e.g.:

metadata/
	document.xml
	bibliography.xml
	figures.xml

Give them a mediatype of text/xml+rdf to distinguish them as metadata 
files.

3)  Associating content with metadata: a simple attribute on the 
content that points to a node in the RDF.

Basically, we can think of enhancing the capabilities of text fields a 
bit, by allowing them to point to these RDF descriptions, rather than 
just rely on attribute strings. Beyond the new citation coding (which 
is a kind of text field, if rather complex), captions are another 
obvious area where this could be useful. Likewise for tables of figures 
and such.
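
As a minimal sketch of how points 2 and 3 could fit together, assuming 
a zip package and Python's standard library: the RDF description of a 
figure lives in metadata/figures.xml, and the content carries one 
attribute that points at a node in it. The attribute name 
"metadata-ref", the "path#id" reference form, the simplified content 
markup, and the sample values are all hypothetical; none of this is 
proposed syntax.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# RDF description of one embedded figure, stored outside content.xml.
# The values are made-up sample data.
FIGURES_RDF = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:ID="fig1">
    <dc:creator>Example Author</dc:creator>
    <dc:rights>Example license terms</dc:rights>
  </rdf:Description>
</rdf:RDF>"""

# Simplified stand-in for content.xml (not real ODF markup).
# "metadata-ref" is a hypothetical attribute name.
CONTENT_XML = """<document>
  <frame name="Figure 1" metadata-ref="metadata/figures.xml#fig1"/>
</document>"""


def build_package():
    """Write the content and its metadata file into an in-memory zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", CONTENT_XML)
        package.writestr("metadata/figures.xml", FIGURES_RDF)
    return buffer.getvalue()


def resolve(package_bytes, ref):
    """Follow a 'path#id' reference to the rdf:Description it names."""
    path, _, node_id = ref.partition("#")
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read(path))
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        if description.get(f"{{{RDF_NS}}}ID") == node_id:
            # Strip the namespace from each property tag for readability.
            return {child.tag.split("}", 1)[1]: child.text for child in description}
    return None


package_bytes = build_package()
figure_ref = ET.fromstring(CONTENT_XML).find("frame").get("metadata-ref")
figure_metadata = resolve(package_bytes, figure_ref)
```

The point of the indirection is that the figure's description stays a 
self-contained RDF file, so it can be carried over when the figure is 
embedded in another document, while the content only needs one pointer.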

The reason why I say phase 2 will be difficult is that I don't think 
XMP is acceptable as is. ODF has needs (and opportunities) for which I 
don't think XMP was designed. I'd rather not get wrapped up debating 
these technical details and lose the forest for the trees (that we need 
to expand metadata beyond the document).

Finally, I did want to mention (as I did to Florian off-list a little 
while ago) that enhancing metadata support in OpenDocument is also 
likely to further the accessibility goal.

Bruce

Follow-Ups:
- Re: [office] metadata and XMP (was: [office] metadata (OpenDocument TCMeeting Minutes ...))
  - From: Lars Oppermann <Lars.Oppermann@Sun.COM>

References:
- OpenDocument TC Meeting Minutes 2006-01-09 and 2005-12-19
  - From: Lars Oppermann <Lars.Oppermann@Sun.COM>