Subject: Re: [office] Metadata options


Somewhat later than I had hoped but a quick reply on Metadata options, including your #6 posted separately.

Philip Boutros wrote:


I thought I'd start the ball rolling on metadata in advance of Monday's call. Please forgive the "schema by example" nature of my examples. I would have presented the suggestions in a schema language (DTD, XSD, RelaxNG) but I'm not sure we've decided on one yet.

General thoughts
The metadata model presented in OpenOffice.org XML File Format 1.0 (OpenOffice) is somewhat specific to the OpenOffice applications and should probably be made more generic and flexible.

Option 1
Leave it alone.

OK, but of all the options, I think #2 is the best.

Option 2
Leave the existing predefined metadata (meta:generator, dc:creator, etc.) as they are but extend meta:user-defined so it can contain more than just text types. This might be done by typing the element name like this (Note: This example is not intended to be my suggestion for the actual tags and attributes we would use but rather to illustrate the concept):

<meta:user-defined-date name="checkin-date">2003-01-24T13:47:12</meta:user-defined-date>
<meta:user-defined-text name="foo">Some text</meta:user-defined-text>

Doing this would also require type-extending text:user-defined to allow formatting of typed user-defined fields like this:

<text:user-defined-date text:name="checkin-date" text:date-adjust="123" text:fixed="true" style:data-style-name="ABC">01/24/03</text:user-defined-date>
<text:user-defined-text text:name="foo" text:fixed="true">Some text</text:user-defined-text>

There are probably other good ways to do this depending on which schema language we go with.

Preferred, since it allows users to extend the metadata in a way that would allow validation of the metadata, if I am reading the proposal correctly.
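
To make the validation point concrete, here is a minimal Python sketch of how a tool might check typed user-defined metadata. It is only an illustration under stated assumptions: the element names are the ones from the example above (which were offered as a concept, not as final tag names), the namespace URI `urn:example:meta` is a placeholder, and the `VALIDATORS` table and `validate_user_defined` function are hypothetical names invented for this sketch.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Placeholder namespace URI, assumed for this sketch only.
META_NS = "urn:example:meta"

# Hypothetical mapping from typed element name to a checker that
# raises ValueError when the element's text is not of that type.
VALIDATORS = {
    "user-defined-date": lambda s: datetime.strptime(s, "%Y-%m-%dT%H:%M:%S"),
    "user-defined-text": lambda s: s,
}

def validate_user_defined(xml_text):
    """Return a list of (name, problem) pairs for entries that fail validation."""
    problems = []
    for elem in ET.fromstring(xml_text):
        if not elem.tag.startswith("{" + META_NS + "}"):
            continue  # not in the metadata namespace
        local = elem.tag.split("}", 1)[1]
        if not local.startswith("user-defined"):
            continue  # predefined metadata is not checked here
        name = elem.get("name", "?")
        check = VALIDATORS.get(local)
        if check is None:
            problems.append((name, "unknown type: " + local))
            continue
        try:
            check(elem.text or "")
        except ValueError as err:
            problems.append((name, str(err)))
    return problems

SAMPLE = """<meta xmlns:meta="urn:example:meta">
  <meta:user-defined-date name="checkin-date">2003-01-24T13:47:12</meta:user-defined-date>
  <meta:user-defined-text name="foo">Some text</meta:user-defined-text>
  <meta:user-defined-date name="bad-date">01/24/03</meta:user-defined-date>
</meta>"""

problems = validate_user_defined(SAMPLE)
for name, problem in problems:
    print(name, "->", problem)
```

With untyped meta:user-defined text, the third entry could not be rejected, because nothing would say that its value is meant to be a date. A schema language could express the same constraint declaratively.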

Option 3
Same as Option 2 but with the predefined metadata elements limited to only things we think EVERY document needs. For example meta:template is very specific to the OpenOffice application and should probably not exist in a more generic specification. The OpenOffice application could then add its template metadata using meta:user-defined-text or meta:user-defined-url.

Not at all certain of the criteria for "EVERY document", any more than how to build "an exhaustive set" as is mentioned in #4. Good example, but are there any others that you see as "OpenOffice application" specific?

Option 4
Same as Option 2 but extend the predefined metadata elements to an exhaustive set that fits almost every user's needs. This would leave meta:user-defined as a very last resort.

As mentioned under #3, I am not sure how such a list could be generated nor what the criteria would be for inclusion. For example, biblical scholars would love to have hands (in the TEI sense), joins to other texts (if fragments), etc., but I don't know that predefined metadata elements for such purposes would be of common enough interest to justify their inclusion. Broad categories of metadata might reduce the number but the broader the categories the less useful the metadata.

Option 5
Move away from predefined metadata completely to a model like in Option 2 but for all metadata elements. This has the downside of moving us away from the Dublin Core. On the positive side we would not have to deal with the difference between predefined metadata elements and user defined metadata elements, the schema would be more compact and tools to process instances of the schema would be simplified.

Disfavored. See reasoning under #6.

Does predefined vs. user-defined raise the issue of RelaxNG being able to ignore elements not in its namespace? In other words, would it be useful to allow a default processing that handles the predefined metadata and optional processing of user-defined metadata?
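
As a rough illustration of what "default processing plus optional processing" could look like on the consuming side, here is a small Python sketch. It is not a statement about how RelaxNG itself behaves. The `urn:example:` namespace URIs, the sample values, and the `split_tag` and `split_metadata` helpers are all assumptions made for this sketch; `http://purl.org/dc/elements/1.1/` is the Dublin Core element set namespace.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set
META_NS = "urn:example:meta"                # placeholder, assumed for this sketch

def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag

def split_metadata(xml_text):
    """Sort child elements into predefined, user-defined, and ignored groups."""
    predefined, user_defined, ignored = {}, {}, []
    for elem in ET.fromstring(xml_text):
        ns, local = split_tag(elem.tag)
        if ns == META_NS and local.startswith("user-defined"):
            # Optional processing: a consumer may use or skip these.
            user_defined[elem.get("name", "?")] = elem.text
        elif ns in (DC_NS, META_NS):
            # Default processing: every consumer is expected to handle these.
            predefined[local] = elem.text
        else:
            # Foreign namespace: recorded here, but otherwise ignored.
            ignored.append(elem.tag)
    return predefined, user_defined, ignored

SAMPLE = """<meta xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:meta="urn:example:meta" xmlns:x="urn:example:other">
  <dc:creator>A. Author</dc:creator>
  <meta:generator>ExampleApp</meta:generator>
  <meta:user-defined-text name="foo">Some text</meta:user-defined-text>
  <x:hand>scribe A</x:hand>
</meta>"""

predefined, user_defined, ignored = split_metadata(SAMPLE)
print(predefined)
print(user_defined)
print(ignored)
```

A consumer that understands only the predefined set can stop after the first group; one that knows a particular community's extensions can go on to the second.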

Phil also posted a 6th option in a separate post:

Option 6
Obviously if we want to move significantly away from the current OpenOffice spec we can look at the wholesale adoption of a metadata schema generated by some other body or creation of a totally new metadata definition from scratch.

(I replied to this one earlier but suspect it went only to Phil.)

I would readily agree that there are things that could be done differently from Dublin Core (whether better or not is an open issue), but the utility of a metadata definition depends largely upon its widespread use. The Dublin Core metadata is clearly incomplete for biblical studies, yet we followed it in the OSIS schema (a joint bible encoding project of the American Bible Society and the Society of Biblical Literature). It was not for a lack of opinions or willingness to start over at the beginning. But we realized that it would be better to have hundreds (now) and hopefully thousands (in the relatively near future) of bible texts with the minimal Dublin Core metadata (understood by the library community) than to have to build a consensus around a new metadata standard.

We retained the ability to extend the metadata for a document but that will always be layering upon the common Dublin Core metadata in every document.

I well understand the temptation to craft a better or more complete solution to problems but there are problems that have no complete or possibly even better solutions for all users. In those cases, my suggestion would be that we take a workable solution and provide for its extension.

Hope everyone is at the start of a great week! (Apologies for the lateness of the reply. I will try to do better in the future.)


Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Co-Editor, ISO Reference Model for Topic Maps
