topicmaps-comment message

Subject: Re: [xtm-wg] Topic Naming Constraint
From: "Steven R. Newcomb" <srn@coolheads.com>
To: xtm-wg@egroups.com
Date: Mon, 15 Jan 2001 14:23:33 -0600

[Kal:]
> I would like to express my concerns about and
> objection to the Topic Naming Constraint expressed in
> the XTM 1.0 specification. Having worked with both
> the ISO 13250 specification and XTM 1.0 specification
> and implemented programming libraries for both of
> these specifications, I find the topic naming
> constraint to be an unnecessary restriction which
> makes the creation of consistent, mergeable topic
> maps exceedingly difficult in any but the most
> restricted situations. My objections are five-fold
> and I will attempt to express them here.

> 1. In my mind, the most important objection is that NAME
> and IDENTITY are two orthogonal concepts. There is no
> way in which a name should be construed as asserting
> identity.

I disagree.  Either names are meaningful, or they
aren't.  If names have nothing to do with identity, why
am I never called "Artur Rubinstein" or, for that
matter, "Species 8472"?  My point is that names have a
very great deal to do with identity, and identity has a
very great deal to do with topic merging.  In
informatics, the normal and primary purpose of naming a
thing is to allow it to be identified -- i.e.,
addressed -- unambiguously and reliably.

The primary design goal of topic maps was always to
make it possible to merge finding aids for corpora of
information.  You seem to be saying that names should
be excluded from being used in service of that goal,
because the discipline that the topic maps paradigm
imposes on the use of names (the topic naming
constraint) is not worth the trouble it causes.  I
disagree with that assessment.  The value of being able
to merge finding aids can scarcely be overstated; if we
lay aside all the hype, that value is the real,
revolutionary importance of topic maps.  If we weaken
the mergeability of topic maps, we invite the question,
"What can topic maps do for me that I can't already do
with plain-vanilla hyperlinks?"  If we take the
position that the specification of scopes is
unnecessary, or that it should be avoided, we have
nothing but a plain-vanilla hyperlink architecture.
With scope, the topic maps paradigm is *much* more
powerful than any hyperlink architecture.

I also think you're overstating the difficulties that
the topic naming constraint imposes on authors of topic
maps.  Yes, it creates work.  But it's useful and
important work that goes to the heart of the reasons
why people need the topic map paradigm.

> Both ISO 13250 and XTM 1.0 recognise the
> orthogonality of these two concepts by providing
> separate constructs for each.

This surprising interpretation does not jibe with the
history of the paradigm, or with the text of either
standard.

> Unfortunately the topic
> naming constraint then smashes the two concepts
> together again making a scoped name into a form of
> identity for a topic.

You are exactly right when you say that a scoped name
is a form of identity for a topic.  I would delete
"Unfortunately", and I'd change "smashes" to "amounts
to a careful and powerful articulation of".

The ONLY reason to merge topics is that they have the
same subject.  But if the names of subjects are
meaningful (i.e., if names are useful for identifying
things), then it is reasonable and appropriate to take
advantage of that fact.

First of all, what is the topic naming constraint?  It
is that no two topics can have the same name (basename)
in the same scope (topic namespace).  Why is that
important?  Because, otherwise, the names of topics
cannot be used to look them up directly; names do not
identify topics.  The usefulness of topic names -- and
topic maps -- would be severely compromised if topics
could not be unambiguously addressed by their names.
The importance of being able to address topics by their
names -- which hasn't been done very much as yet -- can
scarcely be overemphasized.  Kal, if the designs of
your applications are not yet taking advantage of the
topic naming constraint, I would urge you to think
about the bigger application problems that would be
insoluble without it.
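
A minimal sketch of the constraint as stated above, in Python.  The
Topic class, the representation of a name as a (basename, scope) pair,
and the function merge_by_name are illustrative assumptions; none of
them comes from XTM 1.0 or ISO 13250.  The sketch only shows the idea
that topics sharing a basename in the same scope are treated as one
topic, so a scoped name can serve as a lookup key.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    # Hypothetical in-memory topic. Each name is a (basename, scope) pair,
    # where scope is a frozenset of theme identifiers.
    names: set = field(default_factory=set)
    occurrences: set = field(default_factory=set)


def merge_by_name(topics):
    """Merge topics that share a basename in the same scope."""
    index = {}   # (basename, scope) -> merged Topic
    merged = []  # merged topics, in order of first appearance
    for topic in topics:
        # Collect every already-merged topic that shares a scoped name.
        hits = []
        for key in topic.names:
            hit = index.get(key)
            if hit is not None and not any(hit is h for h in hits):
                hits.append(hit)
        target = hits[0] if hits else Topic()
        if not hits:
            merged.append(target)
        # Fold the new topic, and any other matching topics, into the target.
        for other in hits[1:] + [topic]:
            target.names |= other.names
            target.occurrences |= other.occurrences
        merged = [m for m in merged if not any(m is h for h in hits[1:])]
        for key in target.names:
            index[key] = target
    return merged


a = Topic({("Paris", frozenset({"city"}))}, {"paris.html"})
b = Topic({("Paris", frozenset({"city"}))}, {"map.png"})
c = Topic({("Paris", frozenset({"mythology"}))}, {"iliad.txt"})
result = merge_by_name([a, b, c])
```

Here a and b collapse into one topic because they share the name
"Paris" in the same scope, while c stays separate because its scope
differs.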

> 2. For the creator of a topic map, the topic naming
> constraint requires a priori knowledge of the
> vocabulary of all topic maps with which the topic map
> being created is used. In certain situations, this
> is possible - e.g.  the controlled vocabularies used
> by technical documentation departments; the
> controlled set of medical terms defined by
> WHO. However, in the general 'use-on-the-web'
> scenario, controlled vocabularies are not likely to
> prove practical and the topic naming constraint in
> effect restricts the author's freedom to name topics
> as he/she sees fit.

Not true.  Anybody can use any name for any topic, as
long as it's done *consistently* within the same topic
map.  (And people who can't face the discipline of
doing internally-consistent work are constitutionally
incapable of making useful topic maps in any case.)
There is a requirement that, when two different topics
(subjects) must have the same name, the scopes within
which they have those names must be distinct.  This
requirement is basic; it supports the process of 
determining which topic the end user wants:

  Directory assistance: "Which Mr. Smith do you want?
                        Do you want the one on High
                        Street, or the one who is not
                        on High Street?"

The purpose of scope is to support these distinctions.
"Living on High Street" is a topic that either is, or
is not, in the scope within which all Mr. Smiths have
their names.  The idea of changing the name of each
Mr. Smith is inconsistent with the requirements of the
real world.  I would be very surprised to find a
database, for example, that listed my name as "The
Steve Newcomb who lives on Flagler Court."  The idea of
regarding "The Steve Newcomb who lives on Flagler
Court" as a controlled-vocabulary term is absurd, and
it's the wrong way to think about topic names.  You
seem to be determined to avoid the use of scope for
making distinctions between names, but the fact is that
if you don't use scope for your names, you are bound to
have trouble.  Scope is fundamental to topic maps.
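
The directory-assistance exchange can be sketched as a lookup over
scoped names.  This is only an illustration in Python: the index
layout, the theme identifiers, the topic ids, and the rule that a query
matches when its themes are a subset of the name's scope are all
assumptions made for the example, not something either specification
defines.

```python
def lookup(index, basename, scope=frozenset()):
    """Return the ids of topics named `basename` whose name scope
    contains every theme in `scope` (an assumed matching rule)."""
    return sorted(
        topic_id
        for (name, name_scope), topic_id in index.items()
        if name == basename and scope <= name_scope
    )


# Hypothetical index: under the topic naming constraint, each
# (basename, scope) key identifies exactly one topic.
index = {
    ("Mr. Smith", frozenset({"lives-on-high-street"})): "smith-1",
    ("Mr. Smith", frozenset({"lives-elsewhere"})): "smith-2",
}

ambiguous = lookup(index, "Mr. Smith")
resolved = lookup(index, "Mr. Smith", frozenset({"lives-on-high-street"}))
```

Asking for "Mr. Smith" alone returns both candidates; adding the theme
narrows the answer to one topic without renaming anybody.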

Your statement that "the topic naming constraint
requires a priori knowledge of the vocabulary of all
topic maps with which the topic map being created is
used" is simply not true.  When merging multiple topic
maps, it's trivially easy to distinguish the names
applied by various topic map documents (more precisely,
<topicMap> elements) from one another, so as to avoid
name clashes across topic map documents.  Avoiding such
clashes is the primary purpose of the <mergeMap>
feature that adds themes to the scopes of the merged
map's names.
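
As a rough sketch of that idea in Python: adding a distinguishing theme
to the scope of every name that comes from a given map keeps
identically spelled names from clashing.  The function add_themes, the
theme identifiers "map-a" and "map-b", and the (basename, scope) tuples
are hypothetical; this models the effect only, not the XTM syntax or
any particular processor.

```python
def add_themes(names, added_themes):
    """Return the names with `added_themes` added to every scope."""
    extra = frozenset(added_themes)
    return {(basename, scope | extra) for basename, scope in names}


# Two independently authored maps use the same name in the
# unconstrained scope (modelled here as an empty frozenset).
map_a_names = {("Mercury", frozenset())}  # author A means the planet
map_b_names = {("Mercury", frozenset())}  # author B means the element

# Without added themes the scoped names are identical, so the topics
# would merge; with them, the scoped names differ and nothing clashes.
clash_before = map_a_names & map_b_names
clash_after = (add_themes(map_a_names, {"map-a"})
               & add_themes(map_b_names, {"map-b"}))
```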

Finally, if you don't like the facilities that the
topic maps paradigm provides for names, you certainly
don't have to use them.  Just don't give your topics
any names, and, instead, you can define an occurrence
type (or an association type) for your "name-like
things" that has whatever application-defined semantics
you want it to have.  Such occurrences will not be
treated as names, and they will therefore not be
subject to the topic naming constraint.

> 3. The topic naming constraint requires that a user has
> access to the content of potential merged topic maps
> plus specialised topic map processing in order to
> determine if the creation or modification of a topic
> will cause a merge to take place.

How so?  I don't know of any basis in fact for this
statement.

> With subject-based
> merging, standard string manipulation tools will work
> if the user has access to the potential merged topic
> maps and using authoritative subject identities could
> even mitigate the need for access to the potential
> merged topic maps.

I don't know enough about your application context
to understand what you're saying here, so I can't
comment.

> 4. Translation becomes difficult. Not all translations
> between languages are one-to-one: two concepts with
> distinct names which are considered distinct in one
> language may be translated to a single name in
> another language. So a translation from one language
> to another may potentially cause topic merging not
> intended by the creator of the source topic map.

Yes, translation is always difficult, but the
difficulty of translation is not caused by topic maps.
Topic maps merely demand precise translation, in order
to work properly.  This is a good thing, not a bad
thing.  (Unless, of course, you think it's a good thing
for the users of translations to be misled by the
translator's sloppiness or lack of awareness of nuances
in the target tongue.  Personally, I don't think that
it's a good thing when users of topic maps are misled
by them.  I think it's just great if topic map authors
are encouraged to be precise about what they say.  The
more precise, the better.)

> 5. Reification becomes problematic. It would be
> impossible for two reified topic map objects to share
> the same scoped name. For example I may wish to reify
> an occurrence of a topic which represents 'John
> Smith' and give it a name 'Photograph of John
> Smith'. This means that for any other topic about any
> other John Smith, I must be sure not to use the
> string 'Photograph of John Smith' to name an
> occurrence or else the a-nodes representing the
> *occurrences* (not the topics!) will be merged.

Not true.  It's very easy to avoid this name clash, in
the usual way, using the normal topic map scoping
facility: simply include the topic whose subject is the
appropriate John Smith in the scope of the name of the
reified occurrence.  Then, that occurrence cannot be
confused with a photograph of any other John Smith,
even if two photographic occurrences of two different
John Smiths have the name 'Photograph of John Smith'.
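
To make that concrete, here is a small Python sketch.  The helper
scoped_name and the theme identifiers "john-smith-1" and "john-smith-2"
are assumptions made for illustration.  Two names trigger a merge only
when both the string and the scope are equal, so putting the relevant
John Smith topic in the scope keeps the two photographs apart.

```python
def scoped_name(basename, *themes):
    """Model a topic name as a (basename, scope) pair."""
    return (basename, frozenset(themes))


# Unscoped: both reified occurrences carry the identical scoped name.
unscoped_a = scoped_name("Photograph of John Smith")
unscoped_b = scoped_name("Photograph of John Smith")

# Scoped by the topic for the John Smith each photograph depicts.
scoped_a = scoped_name("Photograph of John Smith", "john-smith-1")
scoped_b = scoped_name("Photograph of John Smith", "john-smith-2")

would_merge_unscoped = unscoped_a == unscoped_b
would_merge_scoped = scoped_a == scoped_b
```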

> For these reasons I propose the removal of the topic
> naming constraint from the XTM 1.0 processing model
> and urge the authoring group participating members to
> seriously consider and openly discuss this proposal.

We shall certainly discuss it, but I'm taking this
opportunity to share my opinion that this is a very bad
idea.  If adopted, this proposal will seriously weaken
the topic maps paradigm, and it will do so for no good
reason.  I say again: if you don't like the topic
naming constraint, which is the foundation of the
paradigm's topic naming facilities, then don't use the
topic naming facilities.  In other words, if you don't
want your names to be regarded as being useful for
identifying the topics of which they, uniquely, are the
names, then use the occurrence facilities or the
association facilities to express your topic names.
That's a perfectly valid thing to do, and it will leave
everyone who, unlike you, wants to take advantage of
the enormous identifying and merging power provided by
the topic naming constraint, in a position to continue
to do so.

-Steve

--
Steven R. Newcomb, Consultant
srn@coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

405 Flagler Court
Allen, Texas 75013-2821 USA
