office-metadata message

Subject: Re: [office-metadata] RDFa model and xml:id
From: Patrick Durusau <patrick@durusau.net>
To: Elias Torres <eliast@us.ibm.com>
Date: Wed, 13 Dec 2006 10:43:33 -0500
Elias,

Elias Torres wrote:

>Patrick Durusau <patrick@durusau.net> wrote on 12/13/2006 08:09:40 AM:
>
>  
>
>>Greetings!
>>
>>I think there is some confusion on the RDFa "model" and xml:id. Mostly
>>from mixing two different issues, or at least what I see as two
>>different issues:
>>
>>1. Linking metadata and content: Purely a question of how to associate
>>metadata with content. Has no semantics other than simply linking one to
>>the other.
>>    
>>
>
>You are correct. If we only use xml:id or whichever "linking" mechanism
>between content and meta.xml we don't have semantics, except whatever is
>expressed in the meta.xml. However, we can go very far with the approach
>because that's how RDF works today. For example, I can assert any
>statements I want about any web page on the Internet through an out-of-band
>RDF file hosted on my site.
>
>  
>
>>2. The RDFa (or RDF) model: Therein are questions of identification of
>>subject, etc. But, I don't think those should be mixed with the purely
>>linking issue I pose as #1.
>>
>>Consider that if we follow RDF/RDFa with using URIs to identify
>>subjects, how do we distinguish those from ones simply meant as a
>>    
>>
>
>In HTML, RDFa solves this by the use of id and about. @id is used as a
>linking mechanism, @about is only used to identify the subject. If you
>encounter an @id in the DOM it means nothing to the RDFa extractor.
>
>  
>
>>linking mechanism? Moreover, if I point to an element, do I mean for the
>>element per se to be the subject or something that the element encloses?
>>    
>>
>
>I think you hit the nail right in the center of its head with this
>question. Let me elaborate. Bruce and I keep suggesting (not meaning to be
>a jerk or anything) that there are three (just to be thorough) very
>high-level non-RDF non-RDFa ways of adding metadata to ODF: all metadata is
>outside the content, inside the content or both.
>
>Bruce and I are proposing to do both (hence RDFa) because of the
>following principles.
>
>  
>
Yes, unfortunately both. Cf. my earlier post on why metadata should be 
stored separately.

>- Independence, Modularity, Evolvability
>
>This one doesn't seem to be an issue, because I think everyone is
>comfortable with RDF, especially in the meta.xml
>  
>
This doesn't have any impact on the inline or out of line discussion.

>- In-context metadata
>
>I believe this one is very important because it gives us context if we know
>the location of the metadata. I mean that it's easier for us to "move"
>fragments of the document around without losing their metadata than if we
>had to look through many meta.xml, bibliography.xml, etc.xml files in the
>package, hoping to know what metadata applies to those fragments.
>
>  
>
Sorry, that doesn't follow even from your example. See below.

>Example:
>
><div id="party">
>      <div id="location1">....</div> ... <div id="starttime1"> </div> ...
><div id="description1">.... </div>
></div>
>
>meta.xml
>
><rdf:Description rdf:about="content.xml#location1">
>
># now imagine a very complex graph of attributes, for which there is
>currently no defined way to extract the specific subset of triples that
>start from a given subject. #
>
></rdf:Description>
>
>bibliography.xml
>
><rdf:Description rdf:about="content.xml#location1">
>
># now imagine another very complex graph of attributes, for which there is
>currently no defined way to extract the specific subset of triples that
>start from a given subject. #
>
></rdf:Description>
>
>Hopefully, you are able to see that if we were to remove that <div
>id="party"></div> entirely we would be leaving a lot of metadata around and,
>worse, completely disconnecting the two.
>
>  
>
So if I copy and paste:

<div id="location1">....</div> ... <div id="starttime1"> </div>

I haven't lost the context????

Looks to me like I have, and so far I have seen no proposals to restrict 
copying of content in a way that would preserve the inline metadata.
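
To make the copy-and-paste point concrete, here is a minimal Python sketch. 
It assumes the inline metadata takes the form of an about attribute on the 
enclosing div and property attributes on the children. Those attribute 
placements, the "..." contents, and the copy_fragment helper are 
illustrative assumptions, not part of any proposal made so far.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical content fragment. The "about" and "property" attributes are
# an assumed placement of inline metadata, used only for illustration.
CONTENT = """
<div id="party" about="#party">
  <div id="location1" property="location">123 First Ave</div>
  <div id="starttime1" property="starttime">...</div>
  <div id="description1" property="description">...</div>
</div>
"""


def copy_fragment(root, ids):
    """Copy only the direct children whose id is in `ids`, as a paste would."""
    return [copy.deepcopy(el) for el in root if el.get("id") in ids]


root = ET.fromstring(CONTENT)
pasted = copy_fragment(root, {"location1", "starttime1"})

# The subject lives on the enclosing element, which was not copied, so the
# pasted children no longer say what they are the location or start time of.
subject_survives = any(el.get("about") is not None for el in pasted)
print(len(pasted), subject_survives)
```

Under these assumptions the pasted children keep their own attributes but 
not the subject carried by the enclosing element.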

>- Don't Repeat Yourself
>
>This is the most important of all the RDFa principles and I think the one
>that applies the most to our conversation today. If we were to only use a
>linking mechanism, we would be completely ignoring all of the metadata that
>already resides in the content.xml. RDF is binary in nature, meaning you
>can only relate two things; the minute you want to relate more, the RDF
>becomes a bit verbose and I'd rather avoid that. Let me explain.
>
>  
>
That is only a consideration for hand-authored contexts. SGML went way down 
the wrong path in trying to optimize for hand authoring and caused 
never-ending problems. I would rather not repeat those mistakes in ODF in 
the name of making hand authoring easy. It isn't even a consideration.

Let's assume content is duplicated. So what? My latest box has dual core 
processors and the effort to duplicate content is trivial.

My point being that RDFa has a different set of requirements than ODF 
and I don't think we should import principles that may be sensible in one 
context into another. No sane person is going to hand author ODF documents.

>content.xml
>
><div id="party">
>      <div id="location1">123 First Ave</div> ... <div id="starttime1">
></div> ... <div id="description1">.... </div>
></div>
>
>meta.xml (in N3)
>
><content.xml#party> :location "123 First Avenue" .
>
>We have a slight problem when the metadata *already* exists in the content:
>we don't want to duplicate that data in the meta.xml. Our only option is to
>do everything by reference.
>
>  
>
The non-duplication of data is a presumption that is *not* shared, and 
neither you nor Bruce has made a case for it. To this point all I have 
heard is that RDFa does it that way, which is fine by me but is not an 
argument that we should do it as well.

><content.xml#party> :location <content.xml#location1> .
>
>Unfortunately, I don't like the approach because in order for us to get to
>the content/metadata we have to know which RDF predicates are special,
>meaning de-referenced in order to get to the actual content. This approach
>is non-standard and would confuse most RDF processors. In other words, if
>you point to a resource, you point to a resource; if you point to a
>literal, you have it right there.
>
>Another way to put it, Patrick, is that if we want to do linking only, I
>believe we would have to do either or both: name *everything* and duplicate
>content around.
>
>On another note, I'm trying to understand the actual problems we might have
>with RDFa so I can try to address them. Is it a problem to add a handful of
>attributes to the current ODF schemas? I'd think that if we can add xml:id,
>it shouldn't be a problem adding a few more.
>
>  
>
I don't know how much of a problem it would be to add attributes so I 
can't really answer that question. The fewer we add the better off we 
would be in my opinion.

What I am trying to suggest is that if we can do linking along the lines 
I have suggested, then sure, have RDF in the metadata. Or to put it 
another way, RDFa is designed for a different format than ODF and 
whatever its merits there, I don't think we should take it over 
wholesale for ODF.
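
As a rough illustration of the linking-only approach, the Python sketch 
below resolves rdf:about references in an out-of-line metadata file to 
elements in the content by xml:id. The use of xml:id on the div elements, 
the link_targets helper, and the content.xml#missing entry are assumptions 
made for illustration. The link carries no semantics of its own; it only 
says which element the description is attached to.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical package members, modeled on the example in this thread.
CONTENT = """<div xml:id="party">
  <div xml:id="location1">123 First Ave</div>
</div>"""

META = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="content.xml#location1"/>
  <rdf:Description rdf:about="content.xml#missing"/>
</rdf:RDF>"""


def link_targets(meta_xml, content_xml, content_name="content.xml"):
    """Map each rdf:about value to the content element it links to, or None."""
    content = ET.fromstring(content_xml)
    # Index every element that carries an xml:id.
    by_id = {el.get(XML_ID): el for el in content.iter() if el.get(XML_ID)}
    links = {}
    for desc in ET.fromstring(meta_xml).iter(f"{{{RDF_NS}}}Description"):
        about = desc.get(f"{{{RDF_NS}}}about", "")
        doc, _, frag = about.partition("#")
        if doc == content_name:
            # A dangling link resolves to None rather than raising.
            links[about] = by_id.get(frag)
    return links


links = link_targets(META, CONTENT)
for about, element in links.items():
    print(about, "->", None if element is None else element.text)
```

Whatever the metadata file says about the linked element is then a 
separate question from how the link itself is made.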

Hope you are having a great day!

Patrick

>-Elias
>
>  
>
>>I think if we separate our discussions along those lines we will achieve
>>some clarity. Note that I am not presuming that we will reach agreement
>>but at least it will be clear where we disagree. Which is often the
>>first step towards reaching consensus. If we don't understand each other
>>there is little chance of resolving disagreements.
>>
>>Hope everyone is having a great day!
>>
>>Patrick
>>
>>--
>>Patrick Durusau
>>Patrick@Durusau.net
>>Chair, V1 - Text Processing: Office and Publishing Systems Interface
>>Co-Editor, ISO 13250, Topic Maps -- Reference Model
>>Member, Text Encoding Initiative Board of Directors, 2003-2005
>>
>>Topic Maps: Human, not artificial, intelligence at work!

-- 
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work!