Subject: [Fwd: ODF Metadata]
Hi,

Gary asked me to forward the following mail.

Michael

-------- Original Message --------
Subject: ODF Metadata
Date: Sun, 18 Dec 2005 23:14:02 -0800
From: Gary Edwards <gary.edwards@OpenStack.us>
To: Michael Brauer <Michael.Brauer@Sun.COM>, office@lists.oasis-open.org

Hi ODF TC,

Some of you might remember Bruce D'Arcus. When Duane Nickull (Adobe) first submitted the XMP metadata format for consideration, I contacted Bruce for help in understanding RDF, and the relationship between RDF and XML. Bruce then proceeded to take the issue to a group of XML::RDF experts for comment. Who can forget the fury, velocity, and sheer expertise of the discussion that followed?

Anyway, Bruce has continued to wrestle with the issue in collaboration with Florian, Adobe, and many others. Much of this discussion and work has been taking place at the Foundation, and through direct email. Now that Bruce will be joining the ODF TC, perhaps we can pull together the loose ends, even those that reach deep into the W3C.

I've been asked to post this brief introduction to Bruce. Also included is a summary of the metadata project work. Hopefully his joining the TC will help us to pull the many ideas together, and come up with a solution that truly unlocks the enormous potential of XMP and ODF. I sense great things ahead, with Google in particular benefiting enormously from the ODF metadata work.

Bruce is available for the Monday morning conference call, but I wonder if his Foundation membership application can be processed in time. Is there a contingency routine for this situation? Just wondering.

~ge~

----- From Bruce D'Arcus -------------

Hi All,

I'm in the process of getting signed up for the TC, but am not sure if it will be official or not for the Monday call. I'd very much like to get involved in the metadata discussion ASAP.

Background
============

For those that don't know me, I am co-project lead for the OpenOffice Bibliographic Project.
I am also a professional scholar, and originally came to this work because of frustration with existing tools (and, in retrospect, their incredibly limited metadata support). Since then, I have become an expert on the intersections of XML, metadata, and increasingly, RDF. I have been an active member of the XML metadata community around the Library of Congress, for example, where I not only learned a lot from library metadata experts, but also contributed towards the evolution of their Metadata Objects Description Schema (MODS). Likewise, I am part of a group sponsored by the Nature Publishing Group which consists mostly of people from the academic publishing industry.

I also have some background with OpenDocument. I worked with Daniel Vogelheim on the proposal (approved by the TC last year) to dramatically improve the coding for citations in OpenDocument. That in many ways prefigures this metadata conversation, as that new citation coding consists of a pointer to external metadata. While we had not settled on what that metadata would look like, we had from the beginning assumed it would be stored apart from the content file in the wrapper. It just so happens that this fits perfectly the current metadata discussion.

I feel I have, then, both the background and the practical use case (citations and bibliographic metadata) to help inform this discussion.

On XMP, RDF and OpenDocument
===========================

Phase 1 proposal
----------------

First, let me comment on the mapping proposal that Lars and Florian put together. I think it's in good shape, and the only issues to resolve in my mind are:

1) I wonder if meta:keyword should be deprecated in favor of dc:subject? Here's the definition of the latter:

"Typically, Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme."
The "recommended" practice is thus, in RDF, to do:

<dc:subject rdf:resource="http://example.net/subjects/Software"/>

You might then use SKOS, say, to richly describe those subjects. Indeed, that's what I'd do with my bib data. But the definition of dc:subject certainly doesn't preclude using string literals, and that seems better than using meta:keyword.

2) You can use Qualified DC to replace ODF-specific properties: both dcq:created and dcq:update. The reason why this is important to me long-term is not just to deprecate two elements in favor of more commonly used ones (though this is in itself important), but because DCQ adds most of what we need on top of DC for bibliographic metadata (see below). For example, the dcq:isPartOf relation is really, really critical to being able to model, say, journal articles or book chapters (or chapters in ODF!). See <http://www.dublincore.org/documents/2000/07/11/dcmes-qualifiers/> for more.

3) What's the purpose of "meta:initial-creator"? Just to mark who saved the file?

4) The big question: is it time to get rid of the user-defined elements? E.g. if you allow the sort of rich extensibility Adobe offers with XMP, then that would be a more robust solution than the generic property/value pairs (which are not identified with URIs) that you currently have. Perhaps it is not worth worrying about now, but rather just flag this as an issue, and decide it as part of phase 2?

I think if you take care of the above, you're done.

Phase 2/3
---------

Alan Lillich did a great job laying out the broader issues in his list post.
First, let me point you to three responses to that post, from me, Leigh Dodds (who is a well-respected expert in both XML and RDF, and engineering manager at Ingenta, a major academic metadata and fulltext provider), and Bob DuCharme (similar background as Leigh; works for Lexis-Nexis):

<http://www.ldodds.com/blog/archives/000263.html>
<http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2005/12/09/odf-and-xmp-comments>
<http://www.snee.com/bobdc.blog/2005/12/using_or_not_using_adobes_xmp.html>

For me the conclusion is really that rather than start with XMP, the TC should start with the existing metadata support, which is already very close to RDF. While I believe interoperability with XMP should be an important goal, I see it as a separate issue from what is best for OpenDocument.

What ODF needs is actually fairly simple:

1) To broaden the metadata support beyond the level of the document, so that a consistent approach is used for all metadata needs in OpenDocument. I can provide the bibliographic use case, but there are others.

2) To deepen the metadata support to include:

   i. additional default support for all of Dublin Core, and parts of Qualified Dublin Core
   ii. a mechanism for extension (that goes beyond the current user-defined fields)

I've already written a RELAX NG schema that formalizes the above. It's not hard to do technically at the level of the XML schema [1].

This approach is sort of like training wheels for RDF, where, recognizing that RDF tools are not yet as widely supported as XML tools, you constrain the XML syntax so that it can be easily processed both by RDF and XML tools.
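To illustrate what "easily processed by XML tools" could mean in practice, here is a minimal sketch, not part of any proposal. It assumes a constrained RDF/XML profile in which every resource is a flat rdf:Description element and every property is a direct child holding either a string literal or an rdf:resource attribute. The sample document, its example.net URIs, and the read_triples helper are all hypothetical and made up for illustration. Under those assumptions, a plain XML parser is enough to recover the triples:

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and the Dublin Core element set.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical metadata file in the assumed constrained profile.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.net/docs/report">
    <dc:title>Annual Report</dc:title>
    <dc:subject rdf:resource="http://example.net/subjects/Software"/>
    <dc:subject>metadata</dc:subject>
  </rdf:Description>
</rdf:RDF>"""


def read_triples(xml_text):
    """Return (subject, predicate, object) tuples from constrained RDF/XML.

    Only handles the flat profile described above; general RDF/XML
    (nested descriptions, typed nodes, containers) needs a real RDF parser.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; join them into a URI.
            predicate = prop.tag[1:].replace("}", "")
            obj = prop.get(f"{{{RDF}}}resource")
            if obj is None:
                # No rdf:resource attribute, so treat the text as a string literal.
                obj = (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


if __name__ == "__main__":
    for triple in read_triples(SAMPLE):
        print(triple)
```

Because the same file is also valid RDF/XML, an RDF toolkit should read the same statements from it without any special handling.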
[For an interesting discussion of issues with the RDF/XML syntax and tools, see Dan Brickley's post: <http://danbri.org/words/2005/09/28/137>]

Also, ODF already has a solid packaging mechanism, so storage of the RDF metadata need only exploit that existing support, where one can indicate a metadata file by just using the text/xml+rdf mediatype.

The above approach would add a lot of power to ODF, but with fairly minimal changes. The only other detail to work out is linking from document content to the RDF descriptions. Again, I don't think this will be that difficult, and the new citation coding already points the way to what that might look like.

Finally, if the TC is interested, I am more than willing to work on a formal proposal over the next few weeks to present for consideration, and would be happy to work with others on this.

Bruce

[1] See two further blog posts of mine for how I was thinking about this a couple of months ago (the details have changed, but not the general approach):

<http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2005/10/01/opendocument-mixing-metadata>
<http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2005/10/30/opendocument-and-rdf-storing-what-metadata-where>