office-metadata message

Subject: Re: [office-metadata] summarizing recent suggestions
From: Michael Brauer - Sun Germany - ham02 - Hamburg <Michael.Brauer@Sun.COM>
To: "Bruce D'Arcus" <bdarcus@gmail.com>
Date: Wed, 28 Feb 2007 10:04:17 +0100
Hi Bruce, Elias, Svante,

Bruce D'Arcus wrote:
> 
> On Feb 27, 2007, at 2:18 PM, Svante Schubert wrote:
> 
>> Bruce D'Arcus wrote:
>>>
> 
>>> 2) Get rid of the get and set field ideas, and instead just add a 
>>> single metadata field to display content and associate it with 
>>> resource descriptions. Suggested schema:
>>>
>>>   meta-field = element meta:field { xml-id, [insert generic ODF 
>>> content pattern] }
>>>
>>> Note: in this approach all of the field logic would be encoded in 
>>> RDF/XML. Let's call this option A.
>>>
>>> An alternative (let's call this option B) would be to encode some of 
>>> it in the field (what I had been thinking, though I have no strong 
>>> opinion either way).
>> You suggest to rename the earlier called "meta:text-get" to 
>> "meta:field" and you do not necessarily require RDFa to be specified 
>> for this field, correct?
> 
> Correct.

I have just noticed that the current suggestions all use the meta 
namespace. Since all the other fields we have are in the "text" 
namespace, I suggest that we use the "text" namespace for this new 
field, too, for reasons of consistency. I'm sorry that I didn't notice 
that earlier.

For the same reason, I'm also not sure whether we should add the term 
"field" to the field's name. We currently do so only for the 
"user-field", where "user-field" itself is a term already used by 
office applications. For all other fields, the element name just says 
something about the content or purpose of the field.

What about calling the field just "text:meta", or "text:metadata", or 
"text:metadata-label" (I think the term label was suggested by Bruce)? 
If the name shall contain the term "field", then "text:meta-field" would 
be an option.

My personal favorite actually is "text:metadata" or "text:metadata-label".


> 
>> How does the ODF application 'know' who is 'responsible' for 
>> creating the content based on metadata for this field? In our case, 
>> how do we find the responsible plug-in?
>> Parsing all RDF/XML streams does not seem a good option from the 
>> point of view of an ODF application.
>> But RDFa or even better a further optional attribute (specifying the 
>> implementation) might give us a hint about the responsible plug-in and 
>> would be helpful.
> 
> I personally think the field should be typed in some way. E.g. something 
> like:
> 
> <meta:field xml:id="0874801373" 
> field:type="http://ex.net/Contact">foo</meta:field>

"Typing" it somehow is a good idea. The concept we have for this already 
is to use namespaced names (see, for instance, the chart:class attribute 
described in section 10.2).

This would look like:

<meta:field xml:id="0874801373" xmlns:contact="http://ex.net"
field:type="contact:Contact">foo</meta:field>

For reasons of consistency, I suggest that we reuse this concept, unless 
it would be inconsistent with other metadata standards or otherwise 
inappropriate.
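
To illustrate how a consumer might interpret such a namespaced type, here is a minimal Python sketch. It is not part of the proposal: the namespace URIs bound to the "meta" and "field" prefixes are placeholders (the thread does not define them), and resolve_field_type is a hypothetical helper name. It resolves the prefix of the field:type value against the namespace declarations in scope.

```python
import io
import xml.etree.ElementTree as ET

# Placeholder namespace URIs for "meta" and "field"; the thread does not fix them.
FIELD_NS = "urn:example:field"
DOC = (
    '<meta:field xmlns:meta="urn:example:meta" '
    f'xmlns:field="{FIELD_NS}" '
    'xmlns:contact="http://ex.net" '
    'xml:id="0874801373" '
    'field:type="contact:Contact">foo</meta:field>'
)


def resolve_field_type(xml_text):
    """Return (namespace URI, local name) for the root element's field:type value."""
    prefixes = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            # Namespace declarations are reported before the element that carries them.
            prefix, uri = payload
            prefixes[prefix] = uri
        else:
            qname = payload.get(f"{{{FIELD_NS}}}type")
            prefix, sep, local = qname.partition(":")
            if not sep:
                raise ValueError("field:type value has no namespace prefix")
            return prefixes[prefix], local


print(resolve_field_type(DOC))
```

The type is identified by the pair of namespace URI and local name, so the prefix itself ("contact") carries no meaning and may differ between documents.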

> 
> I think Elias proposed that be encoded in the RDF/XML.
> 
> I have to say for my citation field I'm a little nervous about leaving 
> all of the logic for the RDF/XML.

Me too, for two reasons:

1. I believe that someone who implements, let's say, bibliographic 
support does not want to care about contact information or any other 
metadata that a document may contain, and vice versa.

2. We shouldn't make many assumptions about how a field is actually 
updated (that is, how often the field value is recalculated and how), 
but we have to make sure that this can be done efficiently. I therefore 
think it should be possible for an application to figure out who (for 
instance, which plug-in) may provide the field value from what is stored 
in the content.xml, and the plug-in should be able to get any additional 
data it requires efficiently, too.

A type as suggested by Bruce seems to be a good solution for this. We 
may extend this by an optional URI that links to the RDF/XML stream that 
contains additional data (but that's only a suggestion).

In any case, a solution that requires all RDF/XML streams to be read 
in order to update a field carries a high risk of introducing 
performance issues. For performance reasons, office applications read 
images and embedded objects, for instance, on demand only (that is, when 
they are displayed or edited). We should allow similar behavior for 
metadata, too. A type plus maybe an IRI should allow that, but is 
probably not the only solution to this problem.
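
The on-demand behavior described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the stream names, the LazyStream class, the plug-in registry, and the dictionary-shaped field are all hypothetical, and the "loader" returns ready-made display strings instead of parsing real RDF/XML.

```python
class LazyStream:
    """Stand-in for an RDF/XML stream that is read only when first needed."""

    def __init__(self, name, loader):
        self.name = name
        self._loader = loader
        self._data = None
        self.reads = 0  # how many times the stream was actually read

    def data(self):
        if self._data is None:
            self._data = self._loader(self.name)
            self.reads += 1
        return self._data


def fake_loader(name):
    # Hypothetical package contents: display strings keyed by field id.
    package = {
        "meta/citations.rdf": {"0874801373": "(Doe, 1999: 23; Smith, 2004)"},
        "meta/contacts.rdf": {"42": "Jane Doe"},
    }
    return package[name]


def citation_plugin(field, stream):
    # Only this plug-in touches the stream, and only the one the field links to.
    return stream.data()[field["id"]]


def update_field(field, plugins, streams):
    """Dispatch on the field type; keep the stored text if no plug-in matches."""
    plugin = plugins.get(field["type"])
    if plugin is None:
        return field["text"]
    return plugin(field, streams[field["stream"]])


streams = {name: LazyStream(name, fake_loader)
           for name in ("meta/citations.rdf", "meta/contacts.rdf")}
plugins = {"http://ex.net/Citation": citation_plugin}
field = {"id": "0874801373", "type": "http://ex.net/Citation",
         "stream": "meta/citations.rdf", "text": "foo"}

print(update_field(field, plugins, streams))
print({name: s.reads for name, s in streams.items()})
```

In this sketch the type selects the plug-in and the link selects the stream, so the contacts stream is never read while the citation field is updated.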

> 
> My alternative would be something like:
> 
> <field:field field:type="http://ex.net/Citation" xml:id="0874801373">
>   <field:source>
>     <meta:link meta:resource="urn:isbn:98239809" cite:pages="23"/>
>     <meta:link meta:resource="http://ex.net/1"/>
>   </field:source>
>   <field:body>
>     (Doe, 1999: 23; Smith, 2004)
>   </field:body>
> </field:field>
> 
> I think it's just a practical matter how much the field should contain 
> to best enable document portability, including across file formats (say 
> OOXML; which looks more like the above).

It's an interesting idea. For other text fields, the field description 
itself contains all the data that describes what is displayed, but not 
the value that is displayed. Your idea seems to go in that direction. On 
the other hand, for metadata we assume that a specialized implementation 
provides the string that is displayed. The data that describes what is 
displayed is therefore of value only to this specialized 
implementation. I could therefore also imagine that we actually move it 
to the RDF/XML streams that contain the actual metadata. The only thing 
we have to make sure of is that it is easy to locate that data (see my 
comment above).


>> In this context, you forgot to mention Elias' comment about the 
>> datatype attribute. RDFa gets the content as an XMLLiteral, not as a 
>> string. Elias offered datatype="plaintext" to be able to receive 
>> only text from the ODF element. Any link on this, Elias?
> 
> Yes, I left that out, but agree it should be in, and would support 
> Elias' suggestion on the datatyping.

ODF already has (limited) type support for strings, doubles, 
date/times and durations. See section 6.7.1. This support is based on 
xsd datatypes and already provides support for data styles. If we add 
type support for metadata, I suggest that we define it based on what 
we already have (the current metadata draft actually says so already).

>>
>> Usually we offered in the specification an own element to emphasize 
>> such a scenario, therefore I still suggest an own element for the 
>> first scenario.
>> Although we would not define how an Office should handle such sensible 
>> data, but we would at least give the ODF application a chance to do it.
> 
> I think your word choice of "sensible" here is not quite right. Am not 
> quite sure what you're trying to say here.

Actually, I have no objections to allowing metadata attributes also for, 
let's say, paragraphs or other elements, provided that all of them (in 
particular the about and property attributes) have to appear 
simultaneously.

I only have a few concerns regarding attaching metadata attributes to 
<text:span> elements instead of defining a new element for metadata that 
appears within a paragraph. Why?

<text:span> currently is defined as follows:

"The <text:span> element represents portions of text that are attributed 
using a certain text style or class. The content of this element is the 
text that uses the text style."

That means its purpose is to attach style information to text. As a 
consequence, new <text:span> elements get added and may be removed if 
style information is changed. More importantly, for style information it 
does not actually matter how many <text:span> elements are used to 
attach a style to a piece of text. That is,

<text:span text:style-name="T1">Michael Brauer</text:span>

has the same semantics as

<text:span text:style-name="T1">Michael </text:span><text:span 
text:style-name="T1">Brauer</text:span>

This would be different if we added metadata attributes to <text:span> 
elements. When doing so, we would alter the way <text:span> elements are 
used. For this reason (and only for this reason), I would prefer to 
introduce a new element.
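
The difference can be made concrete with a small Python sketch. It is an illustration only: styled_runs and wrap are hypothetical helpers, and the sketch assumes flat, non-nested spans with no text between them. For styling purposes the two fragments above collapse to the same run of text, yet they contain a different number of elements, which is exactly what per-element metadata would be sensitive to.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def styled_runs(paragraph_xml):
    """Return (style name, text) runs, merging adjacent spans that share a style.

    Simplification: spans are assumed to be flat, with no text between them.
    """
    root = ET.fromstring(paragraph_xml)
    runs = []
    for span in root.iter(f"{{{TEXT_NS}}}span"):
        style = span.get(f"{{{TEXT_NS}}}style-name")
        text = span.text or ""
        if runs and runs[-1][0] == style:
            runs[-1] = (style, runs[-1][1] + text)
        else:
            runs.append((style, text))
    return runs


def wrap(inner):
    # Put the fragment into a paragraph that declares the ODF text namespace.
    return f'<text:p xmlns:text="{TEXT_NS}">{inner}</text:p>'


one = wrap('<text:span text:style-name="T1">Michael Brauer</text:span>')
two = wrap('<text:span text:style-name="T1">Michael </text:span>'
           '<text:span text:style-name="T1">Brauer</text:span>')

# Same styled text either way ...
print(styled_runs(one) == styled_runs(two))
# ... but a different number of elements that metadata attributes could attach to.
print(len(ET.fromstring(one)), len(ET.fromstring(two)))
```

If metadata attributes were carried by the spans, an application that splits or merges spans while editing styles would silently change how many metadata statements exist.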


> 
> Bruce



Michael