office-metadata message

Subject: Export / Import of metadata
From: Svante Schubert <Svante.Schubert@Sun.COM>
To: Elias Torres <eliast@us.ibm.com>, office-metadata <office-metadata@lists.oasis-open.org>
Date: Wed, 28 Feb 2007 18:58:30 +0100
Elias and all,

I was working on the summary of today's thread, but this question is too 
basic to be postponed.
If we also allow RDFa on our set of xml:id ODF elements, we still have a 
problem to solve.

Elias Torres wrote:
> Svante.Schubert@Sun.COM wrote on 02/27/2007 02:18:57 PM:
>
>   
>> ...
>>     
>>> 3) metadata attributes on certain content elements to encode their
>>> content as object literals. Those attributes are (I am ignoring the
>>> value and type stuff, but not deliberately excluding them):
>>>
>>>   meta-about = attribute meta:about { xsd:anyURI }
>>>   meta-property = attribute meta:property { xsd:anyURI }
>>>
>>> # not sure if we need this now, or how to use it
>>>   meta-resource = attribute meta:resource { xsd:anyURI }
>>>
>>> ... and the pattern would be:
>>>
>>>   meta-literal = meta-about, meta-property
>>>       
>> You suggest we should allow the usage of RDFa on our earlier set of
>> xml:id elements, in contrast to the usage of a dedicated element (we
>> called it 'meta:text-set') for those cases where the RDF vocabulary
>> refers to the contained string as an RDF literal instead of the ODF element.
>>
>> In this context, you forgot to mention Elias's comment about the datatype
>> attribute. RDFa gets the content as an XMLLiteral, not as a string. Elias
>> offered datatype="plaintext" to be able to receive only text from
>> the ODF element. Any link on this, Elias?
>>     
>
> http://www.w3.org/2006/07/SWD/RDFa/syntax/#id0x03f5b7b8
>
>   
>> In theory, someone might say we do not even need a meta:field. For
>> example, if there is a citation field generated by a citation plug-in,
>> this RDF application could give its text portion (the citation)
>> certain RDFa values, and in theory this is defined well enough.
>>     
>
> That's correct.
>
>   
>> In practice we have several kinds of text portions with metadata in the
>> content, which are handled differently by an Office application.
>> For instance, if a text portion is generated by an RDF application, it
>> makes sense to make this explicit by offering a dedicated element. You
>> proposed meta:field, aligned to the ODF field mechanism.
>> With a dedicated element the ODF application can easily address this
>> particular scenario.
>>     
>
> I proposed meta:field with the intention of minimizing the number of
> field-xxx elements we have, but in reality, I don't think we need them.
> That's why I want RDFa attributes on most ODF elements so we can do what
> you said.
>
>   
>> And now I believe there is a further differentiation helpful for the Office:
>   
>> Take a look at the following example:
>>
>> <text:p rdf:about="http:/sun.employee/svanteschubert"
>> rdf:property="http:/ex.creditcard-no">5268 3851 2144 9898</text:p>
>>
>> and
>>
>> <text:p rdf:about="http:/ex.chapter" rdf:property="ex:introduction">It
>> was a dark and stormy night ... much more information ...</text:p>
>>
>> The first is some highly sensible data that will most likely be scanned
>> by further RDF applications for further processing.
>> The latter, on the other hand, is simply a flag on the paragraph,
>> categorizing the embedded text, which seems different to me.
>>
>> Usually we offered a dedicated element in the specification to emphasize
>> such a scenario, therefore I still suggest a dedicated element for the
>> first scenario.
>   
>> Although we would not define how an Office should handle such sensible
>> data, we would at least give the ODF application a chance to do it.
>>
>> Best regards,
>> Svante
>>
>>     
>
> I'm not sure really what the difference is between those fields. The
> "sensible" categorization is a bit subjective and we don't have
> requirements for it. However, I'm happy about the fact that you were able to
> express two different types of data according to your personal ontologies
> using RDFa and didn't need an extra element to do so. This is the kind of
> benefit that I see from the metadata proposal. I proposed meta:field (but
> if there's already an ODF field element, let's use that) in the case where
> we really need read-only data (edited via the ODF application or plug-in).
>   
>
We have the requirement:
"Metadata must be able to be processed, extracted, removed and so forth 
independently of the document content."

How do we fulfill the requirement to extract/export metadata (which 
might be done using RDF/XML), when we have to differentiate the 
following use cases for RDFa:

   1. RDFa refers to the literal formed from the concatenated text nodes
   2. RDFa refers to the XML subtree, which seems to be the default from
      the RDF/XML point of view, using 'XMLLiteral'
   3. RDFa just gives information about the element it resides on - is
      this a use case for RDFa?
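
To make the three cases above concrete, here is a minimal Python sketch of
what each one would read from the same paragraph. The sample paragraph, the
xml:id value "p1" and the helper names are assumptions made up for
illustration; they are not part of any proposal in this thread.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Keep the familiar "text:" prefix when child elements are serialized.
ET.register_namespace("text", TEXT_NS)

# Hypothetical sample paragraph with one nested span.
SAMPLE = (
    '<text:p xmlns:text="%s" xml:id="p1">'
    'It was a <text:span>dark and stormy</text:span> night.'
    '</text:p>'
) % TEXT_NS


def plain_literal(elem):
    """Case 1: concatenate all text nodes below the element."""
    return "".join(elem.itertext())


def xml_literal(elem):
    """Case 2: keep the markup of the element's content (XMLLiteral-like)."""
    parts = [elem.text or ""]
    for child in elem:
        # ElementTree serializes the child together with its tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


def element_reference(elem):
    """Case 3: only the element's identity is of interest, not its text."""
    return elem.get(XML_ID)


paragraph = ET.fromstring(SAMPLE)
print(plain_literal(paragraph))
print(xml_literal(paragraph))
print(element_reference(paragraph))
```

Only case 2 carries the text:span markup into the exported value; case 3
carries no content text at all.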

This becomes important during export of the metadata. In my example 
above the whole text of the introduction would be exported as a metadata 
literal, which was not my intention, as I see the text of the 
introduction as content, not as metadata. All I intended to export is 
the RDF statement classifying the paragraph, for example by xml:id.

Would it be appropriate to assume that RDFa always refers to the literal of 
all concatenated text nodes (case 1), that we do not support the XML subtree 
for now (case 2), and that we handle the element itself by xml:id reference 
(case 3)? This would solve my problem above.
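
As a rough illustration of how an export could behave under the convention
proposed above, the following Python sketch collects statements from a small
content fragment. Everything specific in it is an assumption for the sake of
the example: the meta:about / meta:property attribute names follow the
pattern quoted earlier in this thread, the example.org URIs, the "intro" id
and the "content.xml#id" subject form are invented, and the statements about
elements are simply passed in as a list rather than read from a real file.

```python
import xml.etree.ElementTree as ET

META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical content fragment: one RDFa-style paragraph (case 1) and one
# paragraph that is only referenced from outside by its xml:id (case 3).
CONTENT = (
    '<office:text'
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
    ' xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0">'
    '<text:p meta:about="http://example.org/employee/svanteschubert"'
    ' meta:property="http://example.org/creditcard-no">'
    '5268 3851 2144 9898</text:p>'
    '<text:p xml:id="intro">It was a dark and stormy night.</text:p>'
    '</office:text>'
)

# Statements about elements themselves, keyed by xml:id (assumed storage).
ELEMENT_STATEMENTS = [("intro", RDF_TYPE, "http://example.org/Introduction")]


def export_triples(root, element_statements, base="content.xml"):
    """Return (subject, predicate, object) tuples for export."""
    triples = []
    # Case 1: in-content attributes always yield the concatenated text.
    for elem in root.iter():
        about = elem.get("{%s}about" % META_NS)
        prop = elem.get("{%s}property" % META_NS)
        if about and prop:
            triples.append((about, prop, "".join(elem.itertext())))
    # Case 3: statements that point at an element by xml:id are exported
    # without pulling any of the element's text into the metadata.
    known_ids = {e.get(XML_ID) for e in root.iter() if e.get(XML_ID)}
    for target_id, prop, value in element_statements:
        if target_id in known_ids:
            triples.append(("%s#%s" % (base, target_id), prop, value))
    return triples


triples = export_triples(ET.fromstring(CONTENT), ELEMENT_STATEMENTS)
for triple in triples:
    print(triple)
```

With this split, the credit card number is exported as a literal, while the
introduction is only classified and its text stays in the content.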

- Svante
Follow-Ups:
- Re: [office-metadata] Export / Import of metadata
  - From: Bruce D'Arcus <bruce.darcus@OpenDocument.us>
References:
- Re: [office-metadata] summarizing recent suggestions
  - From: Elias Torres <eliast@us.ibm.com>