tm-pubsubj-comment message

Subject: Re: [tm-pubsubj-comment] ISSUE 10 - PSIs "name" "subject" and "description"
From: Murray Altheim <m.altheim@open.ac.uk>
To: Steve Pepper <pepper@ontopia.net>
Date: Thu, 25 Apr 2002 15:10:41 +0000
Steve Pepper wrote:

> Lots of things here. Let's see if I can disentangle myself.
> 
> At 16:01 24/04/02 +0000, Murray Altheim wrote:
> 
>> You hit it on the nose. The PSI is in *topic map terms*. The only
>> reason to use Dublin Core at all is to hook into the DC semantics,
>> to allow non-TM tools a chance to play in the TM sandbox.
> 
> I agree in principle that we should hook into the DC semantics as much 
> as possible, but I don't yet have a clear idea exactly how that might be 
> leveraged in practice. I think I need concrete examples.
> 
>> From the DC element "subject":
>> >  Name:        Subject and Keywords
>> >  Identifier:  Subject
>> >  Definition:  The topic of the content of the resource.
>> >  Comment:     Typically, a Subject will be expressed as keywords,
>> >               key phrases or classification codes that describe a topic
>> >               of the resource.
>> >               Recommended best practice is to select a value from a
>> >               controlled vocabulary or formal classification scheme.
>>                 ^^^^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> The key here is "controlled vocabulary." If we're defining things
>> in a PSI set, it might be a good idea to connect it via an existing
>> controlled vocabulary, such as one found in a library system. Any
>> tools that are DC element set aware would "understand" the dc:subject
>> as such and process accordingly. It seems a shame to go to all the
>> trouble to make XTM/PSI sets ISO 11179 compliant and then leave out
>> dc:subject. Or maybe I'm getting this all wrong.
> 
> To be honest I didn't know we were going to "all the trouble to make 
> XTM/PSI sets ISO 11179 compliant". I know next to nothing about ISO 
> 11179 (probably shouldn't admit that in public :) It's not that I doubt 
> you're right, but could you explain?


I brought this up several times during the original XTM discussions
(though I'm not complaining at all about anyone not remembering that).
My point was that ISO 11179 is a metadata standard, and insofar as TMs
provide hooks into metadata or themselves contain metadata, there
should IMO be hooks in XTM for ISO 11179. These "hooks" can simply be
PSIs.

This isn't as complicated as it might seem. If you read through the
Dublin Core documentation you'll find that DC is itself an implementation
of ISO 11179. That's why the DC element set defines a number of attributes
whose values are constant for all of DC, as in this quote from the DC
element description [DCES]:
 >
 > Each Dublin Core element is defined using a set of ten attributes
 > from the ISO/IEC 11179 [ISO11179] standard for the description of
 > data elements. These include:
 >
 >    * Name - The label assigned to the data element
 >    * Identifier - The unique identifier assigned to the data element
 >    * Version - The version of the data element
 >    * Registration Authority - The entity authorised to register
 >      the data element
 >    * Language - The language in which the data element is specified
 >    * Definition - A statement that clearly represents the concept
 >      and essential nature of the data element
 >    * Obligation - Indicates if the data element is required to always
 >      or sometimes be present (contain a value)
 >    * Datatype - Indicates the type of data that can be represented
 >      in the value of the data element
 >    * Maximum Occurrence - Indicates any limit to the repeatability
 >      of the data element
 >    * Comment - A remark concerning the application of the data element
 >
 > Fortunately, six of the above ten attributes are common to all the
 > Dublin Core elements. These are, with their respective values:
[...]

So, if a topic map author happens to be French, perhaps speaking
no English at all, the author might create a typing topic such as
the following to canonically hook into the DC semantics:

    <topic id="sujet">
      <instanceOf>
        <topicRef xlink:href
      ="http://www.topicmaps.org/xtm/1.0/core.xtm#class"/>
      </instanceOf>
      <subjectIdentity>
        <topicRef xlink:href
      ="http://www.oasis-open.org/psi-sets/dces/dces.xtm#subject"/>
      </subjectIdentity>
      <baseName>
        <scope>
          <topicRef xlink:href
      ="http://www.topicmaps.org/xtm/1.0/language.xtm#FR"/>
        </scope>
        <baseNameString>sujet</baseNameString>
      </baseName>
     </topic>

This topic can then be used to scope base names according to
dc:subject, and could be subclassed for LoC or Dewey.
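
As a minimal sketch of the subclassing idea, assuming the "sujet" topic
above and the XTM 1.0 core PSIs for superclass-subclass: the topic id
"sujet-lcsh" and its base name are hypothetical, chosen only for
illustration, and this is one possible way to express it rather than an
agreed convention.

```xml
<!-- Hypothetical subclass of "sujet" for Library of Congress subjects. -->
<topic id="sujet-lcsh">
  <baseName>
    <baseNameString>sujet (Library of Congress)</baseNameString>
  </baseName>
</topic>

<!-- Relates the two topics using the XTM 1.0 core PSIs. -->
<association>
  <instanceOf>
    <topicRef xlink:href
  ="http://www.topicmaps.org/xtm/1.0/core.xtm#superclass-subclass"/>
  </instanceOf>
  <member>
    <roleSpec>
      <topicRef xlink:href
  ="http://www.topicmaps.org/xtm/1.0/core.xtm#superclass"/>
    </roleSpec>
    <topicRef xlink:href="#sujet"/>
  </member>
  <member>
    <roleSpec>
      <topicRef xlink:href
  ="http://www.topicmaps.org/xtm/1.0/core.xtm#subclass"/>
    </roleSpec>
    <topicRef xlink:href="#sujet-lcsh"/>
  </member>
</association>
```

A Dewey subclass would follow the same pattern with its own topic id.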

Steve continues:
> However, that's a side issue. My real problem is that I simply don't 
> understand how you think we would use dc:subject. That's why I asked 
> what a typical value might be.
> 
> Let me try and make it easier for you to help me. In the following 
> example of a piece of text used as a PSI and employing DC semantics, 
> what might go in the spot marked "*****"?
> 
>   Title:        Norway
>   Description:  Country in the Scandinavian peninsula bordering
>                 on Sweden, Finland, and Russia.
>   Identifier:   http://www.topicmaps.org/xtm/1.0/country.xtm#no
>   Subject:      *****
> 
>> Now, OTOH, I did a preliminary PSI set for DC last year (I think it
>> is included in that pile of stuff I posted to this group) which
>> created PSIs for each of the DC elements. Then, an author could
>> scope a base name with the dc:subject PSI. That's how I'd planned
>> to use it, though there certainly may be better ways.
> 
> Well, that's fine, but now you're talking about the author of a topic 
> map, right, not the author (or publisher) of a PSI set, most of whom 
> probably *won't* use topic maps to express the PSI set. It leads back to 
> the same question: What are the semantics (or purpose) of a dc:subject 
> property attached to a PSI?


Let me give two concrete, valuable examples (both hypothetical):

   1. The OCLC (Online Computer Library Center) publishes WorldCat,
      its entire online library catalog. Were they to publish that
      electronic catalog as a topic map, they could express the
      subject(s) of each library resource as a scoped name, where
      the scope is "subject of this resource". But first, that
      would really be semantically incorrect since the subject
      identifier is not a name (though it would at least work
      technically), and second, there's no hook into the semantics
      yet of DC. This latter problem is a real problem because
      the topic used for the scope has no definition outside of
      the topic map realm. But defining a PSI for dc:subject
      would allow the PSI to define (canonically and unambiguously)
      the subject identifier within a very specific scope.
      Furthermore, dc:subject can be extended (within the DCES
      system) to delineate *which* catalog system the identifier
      belongs to (e.g., US Library of Congress, Dewey (DDS), etc.).

   2. The maintainers of a software system, say, an online
      zoological taxonomic reference, wish to use an ISO
      11179-compliant scheme for describing the metadata on their
      extensive web site. This includes thousands of references to
      animals, plants, etc., and their online navigation system is
      backed by a topic map system. So both for compatibility with
      non-topic map taxonomic systems and for subject identifier
      compatibility with other topic maps (so that proper merging
      can take place), it's necessary that the identifiers be
      scoped correctly, both technically and descriptively.
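
To make Example 1 a little more tangible, here is a rough XTM 1.0
sketch of a catalog record whose subject is carried as a base name
scoped by a dc:subject topic. The topic ids, the catalog address on
example.org, the title, and the subject heading are all invented for
illustration; only the dc:subject PSI address is the one used earlier
in this message.

```xml
<!-- Local topic standing in for the dc:subject PSI. -->
<topic id="dc-subject">
  <subjectIdentity>
    <topicRef xlink:href
  ="http://www.oasis-open.org/psi-sets/dces/dces.xtm#subject"/>
  </subjectIdentity>
</topic>

<!-- Hypothetical catalog record; the address is an assumption. -->
<topic id="record-0001">
  <subjectIdentity>
    <resourceRef xlink:href="http://catalog.example.org/record/0001"/>
  </subjectIdentity>
  <!-- Unscoped name: the title of the resource. -->
  <baseName>
    <baseNameString>A History of Norway</baseNameString>
  </baseName>
  <!-- Name scoped by dc:subject: the subject of the resource. -->
  <baseName>
    <scope>
      <topicRef xlink:href="#dc-subject"/>
    </scope>
    <baseNameString>Norway -- History</baseNameString>
  </baseName>
</topic>
```

As noted in Example 1, treating the subject identifier as a name is
semantically questionable, so this shows only the approach that works
technically, not a recommended design.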

Finally, I don't see how you can create a set of DC descriptors
for a PSI set (which is a good idea), yet not allow that facility
for PSIs themselves. I see the whole PSI concept as hierarchical
(taxonomical), such that one might create a system where you
could drill up or down in a PSI set, just as in, say, a URN
scheme:

     urn:yahoo:Regional:Regions:Middle_East:Society_and_Culture:
         Issues_and_Causes:Human_Rights:Refugees:
         Palestinian_Refugees:Organizations

Now, wouldn't it be nice to have a cross reference from each of
the taxons in that URN to a subject identifier, with say, links
to the library catalog entries in Example #1? [hint: business
possibility here...] You might have a topic map PSI set at each
level (it'd be crazy not to).
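
One way such a cross reference might look for a single taxon of that
URN, as a sketch only: the topic id, the subject indicator address, and
the catalog address below are all assumptions (example.org placeholders),
not existing PSIs or catalog entries.

```xml
<!-- Hypothetical topic for the "Refugees" level of the URN above. -->
<topic id="refugees">
  <subjectIdentity>
    <!-- Assumed subject indicator; no such PSI set is known to exist. -->
    <subjectIndicatorRef xlink:href
  ="http://psi.example.org/yahoo/human-rights/#refugees"/>
  </subjectIdentity>
  <baseName>
    <baseNameString>Refugees</baseNameString>
  </baseName>
  <occurrence>
    <!-- Assumed link to a library catalog entry on this subject. -->
    <resourceRef xlink:href
  ="http://catalog.example.org/subject/refugees"/>
  </occurrence>
</topic>
```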

---------------------
Note that there's also the DCMI Type Vocabulary for describing the
media type of resources, which should IMO be a PSI set too. Put it
this way: I'll be developing these regardless of whether anyone is
interested, as *I* need them. I'm kinda looking for (and prior to
a change in circumstance thought I'd be more deeply involved in)
a set of standards or recommendations for publication of both the
DCES and DCMI PSI sets. It's a pretty simple pair of topic maps,
and would take all of several hours to produce both, with subject
identity back to the original DC web pages.

Murray

[DCES]  http://dublincore.org/documents/dces/
......................................................................
Murray Altheim                  <http://kmi.open.ac.uk/people/murray/>
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK

      In the evening
      The rice leaves in the garden
      Rustle in the autumn wind
      That blows through my reed hut.  -- Minamoto no Tsunenobu