OASIS Mailing List Archives

office-metadata message



Subject: Re: [office-metadata] "Logical/abstract" vs. "physical" representation


Hi Bruce,

Bruce D'Arcus wrote:
> 
> Michael,
> 
> I think we need to separate the question of document URI (the rdf:about 
> value for the document per se) from formal XML base URI. I am saying we 
> need the former, and that it needs to be unique and stable.

Yes, we should. But isn't a document URI assigned to documents from the 
outside? I do understand that documents need a stable and unique URI for 
certain metadata applications, but my impression is that defining that 
is outside the scope of our TC. We are defining a document format for 
office applications. We are not defining a mechanism to identify 
documents. So all we can do, in my opinion, is assume that there may 
be a stable unique IRI, but we should not try to define what it looks like.

> 
> On Mar 5, 2007, at 11:33 AM, Michael Brauer - Sun Germany - ham02 - 
> Hamburg wrote:
> 
>>> The point is, for the metadata system, they are *always* required if 
>>> we want things to work reliably.
>>
>> I'm not sure about this. If I simply save a document on my hard drive, 
>> why do I need a different base IRI in this case than the one the 
>> document gets because it is located somewhere on my hard drive? All 
>> statements I make in this document using relative IRIs are in the 
>> first place statements about certain ODF objects in exactly that 
>> document.
> 
> A URI is an identifier; nothing more. It allows us to refer to 
> resources; in this case documents and document fragments. If those URIs 
> are not persistent and globally unique, they are not very useful.
> 
> Example 1:
> 
> I have a field in my document with the xml:id value of "123."
> 
> The document file path is "file:///foo.odf".
> 
> A triple for that field thus might be:
> 
> <file:///foo.odf#123> dc:description "blah" .
> 
> The RDF/XML thus looks like:
> 
> <rdf:Description rdf:about="file:///foo.odf#123">
>   <dc:description>blah</dc:description>
> </rdf:Description>
> 
> If you move the file, the statements are invalid.

Yes. But I thought what we store is the following:

<rdf:Description rdf:about="content.xml#123">
   <dc:description>blah</dc:description>
</rdf:Description>

Please note that I removed the path of the document from the IRI, and 
have added the path within the package. The IRI is relative now, but my 
understanding from what Elias said is that this is okay for the RDF/XML, 
as long as the IRI becomes absolute in the RDF model.

By applying the usual rules for converting relative IRIs to absolute 
ones (as defined by RFC 3986), with the package treated as a directory, 
the IRI is resolved to the following if the file location is 
file:///foo.odt:

file:///foo.odt/content.xml#123

If you move the file around, the IRI changes, but the statement you make 
remains valid.
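
To illustrate the resolution step, here is a minimal sketch in Python. 
It assumes that the package location is treated as a directory, that is, 
a trailing slash is appended to the base before resolving. The helper 
name resolve_in_package is hypothetical and not part of ODF or any 
library. Without that assumption, strict RFC 3986 resolution replaces 
the last path segment of the base, so the package name drops out.

```python
from urllib.parse import urljoin

RELATIVE_IRI = "content.xml#123"


def resolve_in_package(package_location: str, relative_iri: str) -> str:
    """Resolve a relative IRI against a package treated as a directory.

    Assumption for illustration: the package location acts as the base
    "directory", so a trailing slash is added before resolving.
    """
    base = package_location
    if not base.endswith("/"):
        base += "/"
    return urljoin(base, relative_iri)


# Same stored relative IRI, two different file locations.
before_move = resolve_in_package("file:///foo.odt", RELATIVE_IRI)
after_move = resolve_in_package("file:///archive/foo.odt", RELATIVE_IRI)
print(before_move)  # file:///foo.odt/content.xml#123
print(after_move)   # file:///archive/foo.odt/content.xml#123

# Strict resolution against the file location itself (no trailing slash):
# the last segment "foo.odt" is replaced by the relative reference.
strict = urljoin("file:///foo.odt", RELATIVE_IRI)
print(strict)       # file:///content.xml#123
```

The absolute IRI differs before and after the move, but in both cases 
it points at the object with the ID 123 in the content.xml of whatever 
package is being loaded.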


> 
> Example 2:
> 
> I want to represent the relations between different documents. I want to 
> say file x.odf is a draft of file y.odf.
> 
> Triples, if there is no stable full URI:
> 
> <file:///y.odf> dcterms:isVersionOf <file:///x.odf> .
> 
> Again, what happens if the document moves? What happens if it moves to a 
> desktop where a user has the exact same file name and path, but it is 
> actually a different file?

For this use case you can use the IRI "." (that denotes the current 
document) as subject, but you do need a stable IRI for the object. 
But as I said above, it is in my opinion not within the scope of our TC 
to define that. That's something that operating system vendors, content 
management system vendors etc. have to define, not us.

> 
> Example 3:
> 
> Rob wants to enable his use case of external annotation of files. Same 
> problem as above.

Yes and no. As for the location of the file itself, it is the same 
problem (Rob, what's your point of view: Is it within the scope of our 
TC to define stable IRIs for documents?) But I assume Rob wants to 
annotate not only the document itself, but also objects within it. And 
that's where the relative IRIs of your first example could be used again.

> 
>> And if I make a document accessible on a web server, why is a base 
>> IRI other than the HTTP URI the document gets anyway required? If this 
>> is sufficient for the HTML case, why isn't it sufficient for the ODF 
>> case?
>>
>> We also have to consider that a base IRI is applied to all relative 
>> IRIs. That is, it may actually break non-metadata IRIs. So, this 
>> feature has to be used with care.
> 
> As I said, we need a document URI. I'm saying it would be very bad 
> practice to simply leave this an optional designation and for everyone 
> to rely on file paths.

I do understand that, but still think it's not within the scope of our 
TC. But let's assume we have that IRI. It would be stored somewhere 
within the document. How does an application that evaluates an RDF 
document that contains that IRI know to which document the IRI belongs? 
Does it have to search all documents for one that contains that IRI? Or 
must all environments that may store ODF documents implement our 
IRIs? So it's not only a question of scope, but also an 
implementation question.

Michael
> 
> Whether those are used as formal xml:base values or not seems to me a 
> separate matter.
> 
> Bruce


-- 
Michael Brauer, Technical Architect Software Engineering
StarOffice/OpenOffice.org
Sun Microsystems GmbH             Nagelsweg 55
D-20097 Hamburg, Germany          michael.brauer@sun.com
http://sun.com/staroffice         +49 40 23646 500
http://blogs.sun.com/GullFOSS


