Subject: MeSH example, extensibility
Patrick raised the issue of MeSH today for subject tagging of documents, so I looked into it a little more. Here's what I found ...

MeSH subject identifiers do have some sort of ID (not sure if they are represented as URIs also), and there is in fact a DCTERMS property specifically for encoding it. See here, towards the bottom:

<http://dublincore.org/documents/dcmi-terms/>

This seems relevant too:

<http://www.nlm.nih.gov/tsd/cataloging/metafilenew.html>

E.g. a MeSH subject heading could presumably be represented using something like this:

    <dcterms:mesh>D000005</dcterms:mesh>

Not sure I have the details right (that id looks wrong, for example), but you get the idea.

The above could then be included in the document meta.xml file for indexing (and is, coincidentally, a valid RDF property). For purposes of indexing and search, then, that would go a long way without even needing to embed any source. It would work quite elegantly with the approach I've been advocating.

I can see the same thing using the RDF-based SKOS vocabulary.

<http://www.xml.com/pub/a/2005/06/22/skos.html>

In that case, I'd imagine also that the source descriptions would likely not typically get embedded in the ODF file (though of course *could*), but the identifiers would be the critical bit.

To be clear, then, *this* is what I mean by extensibility: the ability to add foreign properties to a resource description that correspond to a common model so that *tools know what they are.* It would be absolutely counterproductive in my view to allow developers to throw out the ODF metadata entirely and use their own schema: a kind of all-or-nothing view of extensibility.

It is NOT, then, a notion of documents and schemas, and really gets to Florian's discussion of the abstract API. E.g. if we think of the model in OO terms, then a document object (citation, table, image, etc.) gets described by something like a Resource object, which includes:

- an (optional) URI identifier
- one or more (optional) types
- an array of Property objects

There are then two kinds of Property objects:

- Literal (the simple property-value case above)
- linked Resource

This is simple, but very powerful.

If we don't use a common model (e.g. "rules for extension"), then we'll paint ODF into a rather tight corner, and we'd introduce more problems than we'd solve.

Bruce

PS - I managed to write an XSLT in about 30 minutes that converted a MeSH example I found to this valid RDF. Really not hard.

    <DescriptorRecord xmlns="http://nih.org/mesh/"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        rdf:ID="D000005">
      <descriptorName>Abdomen</descriptorName>
      <annotation>region &amp; abdominal organs</annotation>
      <preferredConcept>
        <Concept>
          <conceptUI>M0000005</conceptUI>
          <conceptName>Abdomen</conceptName>
          <scopeNote>That portion of the body that lies between the thorax and the pelvis.</scopeNote>
          <term>
            <Term>
              <print>Y</print>
              <termUI>T000012</termUI>
              <value>Abdomen</value>
              <dateCreated>1999-01-01</dateCreated>
            </Term>
          </term>
          <permutedTerm>
            <Term>
              <lexicalTag>NON</lexicalTag>
              <termUI>T000012</termUI>
              <value>Abdomens</value>
            </Term>
          </permutedTerm>
        </Concept>
      </preferredConcept>
    </DescriptorRecord>
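
For illustration only, the Resource/Property model outlined in the message could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not code from the thread or from any ODF proposal: the class names LiteralProperty and ResourceProperty, the helper literal_values, and the urn:example identifier are all hypothetical, and property names are kept as plain strings rather than resolved URIs.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Property:
    """Base class: every property has a name (e.g. "dcterms:mesh")."""
    name: str


@dataclass
class LiteralProperty(Property):
    """The simple property-value case: the value is a plain string."""
    value: str


@dataclass
class ResourceProperty(Property):
    """A property whose value is another (linked) Resource."""
    target: Resource


@dataclass
class Resource:
    """Describes a document object: optional URI, optional types, properties."""
    uri: Optional[str] = None
    types: List[str] = field(default_factory=list)
    properties: List[Property] = field(default_factory=list)


def literal_values(resource: Resource, name: str) -> List[str]:
    """Collect literal values for `name`, following linked resources depth-first.

    Assumes the description is a tree; cycles are not detected.
    """
    found: List[str] = []
    for prop in resource.properties:
        if isinstance(prop, LiteralProperty):
            if prop.name == name:
                found.append(prop.value)
        elif isinstance(prop, ResourceProperty):
            found.extend(literal_values(prop.target, name))
    return found


# A document tagged with a MeSH identifier (hypothetical document URI).
document = Resource(
    uri="urn:example:document-1",
    properties=[LiteralProperty("dcterms:mesh", "D000005")],
)

# Part of the MeSH record from the PS, expressed with the same model.
record = Resource(
    types=["DescriptorRecord"],
    properties=[
        LiteralProperty("descriptorName", "Abdomen"),
        ResourceProperty(
            "preferredConcept",
            Resource(
                types=["Concept"],
                properties=[
                    LiteralProperty("conceptUI", "M0000005"),
                    LiteralProperty("conceptName", "Abdomen"),
                ],
            ),
        ),
    ],
)
```

Because both the simple tag and the nested record use the same two property kinds, one generic walker such as literal_values can index either without knowing the vocabulary in advance, which is the sense of "tools know what they are" above.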