office-metadata message

Subject: Re: [office-metadata] Focus on model
From: Elias Torres <eliast@us.ibm.com>
To: Lars Oppermann <Lars.Oppermann@Sun.COM>
Date: Fri, 15 Dec 2006 14:43:11 -0500

Lars.Oppermann@Sun.COM wrote on 12/15/2006 11:00:11 AM:

> Hello Elias,
>
> Some more thoughts from me...
>
> So it looks, like for the indexing applications, which want to store
> information about documents, there needs to be a defined 'function',
> that describes how to extract the meta-data-model from a given ODF
> document, so that statements made about specific parts of the document
> can be related back to the document as well as the specific parts which
> they relate to. The syntax, by which this model is represented inside
> the document is largely (if not entirely) irrelevant, as long as a
> function exists, that can extract it. This function should also be
> implementable with reasonable effort on a wide range of platforms.

The syntax might be irrelevant in the context of your statements and the
goals of the TC. However, in order to properly specify the function that
extracts metadata from an ODF package and make it implementable with
reasonable effort on a wide range of platforms, syntax *does* matter. In
other words, at the end of the day, we do have to leave the high-level
discussion, define the functions, and get an idea of how easy they are to
implement.

>
> I can imagine both, a function that does the 'special' resource
> resolution which I used in my RDF/XML examples as well as another
> function, which pulls resources that are marked as such directly from
> the content. Both functions would have the same result.

That's correct, same result. Rob Weir calls this the logical model as
opposed to the physical one (i.e. the data model vs. its serialization).
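
To make the logical/physical distinction concrete, here is a minimal sketch
in Python. Everything in it is hypothetical: the element and attribute names
(doc, p, meta, statement, about, property, value) are made up for
illustration and are not ODF or RDF/XML syntax, and neither function is a
proposal for the actual extraction function. It only shows two physical
serializations yielding the same logical set of statements, with the
separate-part variant needing an extra resolution step against the content.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, NOT ODF syntax: one statement, serialized two ways.
INLINE = '<doc><p id="p1" property="dc:creator" value="Elias">Hello</p></doc>'

CONTENT = '<doc><p id="p1">Hello</p></doc>'
SIDECAR = '<meta><statement about="p1" property="dc:creator" value="Elias"/></meta>'


def extract_inline(content_xml):
    """Pull statements that are marked directly on content elements."""
    root = ET.fromstring(content_xml)
    return {
        (el.get("id"), el.get("property"), el.get("value"))
        for el in root.iter()
        if el.get("property") is not None
    }


def extract_sidecar(content_xml, meta_xml):
    """Resolve statements kept in a separate part against ids in the content."""
    content_ids = {
        el.get("id")
        for el in ET.fromstring(content_xml).iter()
        if el.get("id") is not None
    }
    return {
        (st.get("about"), st.get("property"), st.get("value"))
        for st in ET.fromstring(meta_xml).iter("statement")
        if st.get("about") in content_ids  # the extra resolution step
    }


# Same logical model from both physical forms.
assert extract_inline(INLINE) == extract_sidecar(CONTENT, SIDECAR)
```

The two functions return identical sets of (subject, property, value)
tuples; what differs is how much work each one does to get there.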

>
> Now, if we approach the matter from the generation side rather than from
> the consumption side, it should also be easy to a) create the meta-data
> content while working on the content and b) preserve the meta-data's
> integrity when the content is modified.

Absolutely.

>
> It looks to me, like all approaches which we looked at seem equally
> suited to defining a function that pulls the meta-data from the document.

I'm not sure I'd agree with the word "equally". I have highlighted issues
with the 'special' resource solution and the burden it imposes on
implementors, especially for extraction. I'm also hoping to hear any
specific objections to the hybrid approach so we can compare. In essence,
I'm not sure it's a matter of tossing a coin and picking the outcome.

>
> Is this understanding consistent with how you others think about this
> matter, or are there other aspects that should have an influence on this?

I think copy-and-paste behavior could be added to the list, but the fact is
that all the use cases can be implemented with either approach, except that
I believe we are imposing undue processing requirements if we exclude
metadata from the content. I have a demo scheduled for the next meeting that
will hopefully show a scenario hinting at how much processing could be
involved in each of the approaches.

-Elias

>
> Bests,
> Lars
>
References:
- Re: [office-metadata] Focus on model
  - From: Lars Oppermann <Lars.Oppermann@Sun.COM>