topicmaps-comment message

Subject: [xtm-wg] Spec comments
From: Lars Marius Garshol <larsga@garshol.priv.no>
To: xtm-wg@yahoogroups.com
Date: 08 Feb 2001 10:52:14 +0100

General
=======

Throughout the specification the emphasis placed on subject indicators
and the computable identity of resources seems to indicate that it is
the resources used as indicators that are the basis for merging, when
it must in fact be the locators of the indicators.  This must be so
because it is utterly unreasonable to expect processors to download
all indicators and compare them to compute identity.  It might be
useful if the emphasis were shifted onto the locators instead, to make
this clearer.
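
To illustrate what merging on locators could look like, here is a
minimal sketch in Python.  Everything in it is an assumption made for
illustration: the function name, the representation of a topic as an
id mapped to a set of locator strings, and the choice of plain string
comparison of locators.  It is not taken from the specification.

```python
from collections import defaultdict


def merge_by_indicator_locator(topics):
    """Group topic ids that share at least one subject indicator locator.

    `topics` maps a (hypothetical) topic id to a set of locator strings.
    Locators are compared as strings; no indicator is ever downloaded.
    """
    parent = {t: t for t in topics}

    def find(t):
        # Union-find lookup with path halving.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    first_seen = {}  # locator -> first topic seen using it
    for topic, locators in topics.items():
        for loc in locators:
            if loc in first_seen:
                parent[find(topic)] = find(first_seen[loc])
            else:
                first_seen[loc] = topic

    groups = defaultdict(set)
    for t in topics:
        groups[find(t)].add(t)
    return sorted(sorted(g) for g in groups.values())


example = {
    "t1": {"http://example.org/a"},
    "t2": {"http://example.org/a", "http://example.org/b"},
    "t3": {"http://example.org/b"},
    "t4": {"http://example.org/c"},
}
merged = merge_by_indicator_locator(example)
```

In this sketch t1, t2 and t3 end up in one group because they are
chained together by shared locators, while t4 stays alone.  The cost
depends only on the number of locators, not on the size of the
resources they point to.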

The terms 'address' and 'locator' are nowhere defined, by the way,
even though they are absolutely central to the specification.


1.2 Goals
=========

Second para

  There are also references to RFCs 2119 and 1738 in the
  specification.  Section 4.1 should perhaps be moved into 1.3.


1.3 Terminology
===============

Since this whole section is in any case repeated, in a different
order, in section 2, I think it might as well be removed.  If anything
should turn out not to be defined in section 2, the appropriate text
could be moved there.  Section 2 is far more readable, and so 1.3 does
not seem to serve any purpose, especially given the frequent
hyperlinks from the use of terms to their definitions.

Removing it would 

 - reduce bloat,
 - not change the meaning of the spec (unless there be bugs), 
 - make it less forbidding to read (since the spec would start with a
   gentle introduction, rather than a definition of terms in
   non-conceptual order), and,
 - reduce the chances of internal contradictions.


consistent topic map

  Defined as a topic map that has "one topic per subject and no
  further opportunities for merging or duplicate suppression, as
  defined in Annex F".

  However, as is noted elsewhere, the rules of Annex F are not
  sufficient to guarantee that there will be one topic per subject.
  The definition should be weakened to allow for this, especially
  since F.1 para 1 and 2.2.5.2 require processors to present a
  consistent topic map to the application, something that the
  processor under this definition cannot possibly be expected to
  achieve.


  The term is also used in B.9, but in what seems to be a weaker
  form (no mention of duplicate suppression):

  "It is possible for more than one Topic in the Topic Map to reify
  the same Subject. If no Subject is reified by more than one Topic in
  the Topic Map, then the Topic Map is said to be a Consistent Topic
  Map."

  I suggest that these two sentences simply be removed, as they do not
  say anything new in any case.


member

  Defined as "A topic that plays a role in an association.", which in
  2.2.4.1 becomes "A member is a topic (or set of topics) that plays a
  particular role in an association."  I believe both definitions
  should define a member as a non-empty set of topics.

  When compared with the definition of 'role' the result is also
  rather confusing. How can one distinguish between a role and a
  member, and of what use is that distinction? In my opinion the term
  'member' should be removed, but allowed to remain in the syntax.
  Even just removing it from the definition of 'role' would help.


role

  The definitions here and in 2.2.4.2 are too brief to provide any
  real understanding of what a role really is.  The syntax seems to
  indicate that it is effectively specified by the role type, but this
  is nowhere stated.

  Also compare the definition with the usage in 2.2.4, para 1,
  sentence 2. The role-as-topic-characteristic seems awkward with this
  definition.


subject identity

  Item 3 does not seem to be a separate sense or meaning of the term,
  but rather additional information about item 2.


topic characteristic

  Defined as a topic name, occurrence or role, where topic name is
  defined as a base name.  So what about variant names?

  Suggestion: mention variants under the definition of topic name.


topic map document

  Defined as "A document that contains one or more topic maps that
  conform to this specification. It may be serialized for the purpose
  of storage or interchange in a syntax governed by this or some other
  specification."

  The first sentence seems to contradict the second.  According to the
  first para of the abstract this specification defines a 'grammar',
  which is a syntax. (It is certainly not a model!) Hence, something
  that conforms to this specification must also conform to the syntax.

  The term 'document' to me implies that it must be an XML or SGML
  document, but this definition seems to imply that it can be a topic
  map structure in a database or some completely different
  serialization syntax. To me it would seem sensible to reserve some
  other term for some of these uses, such as 'topic map graph', which
  it would make sense for the object model/grove model to define.

  I assume that this term is intended to cover both XTM and 13250, and
  if so I suggest that it be abandoned in favour of 'XTM document' and
  that 'topic map document' remain an informal term. There is nothing
  formal to base it on in any case.

  I suggest that 'XTM document' be defined as "An XML document
  conforming to the syntax and other requirements in this
  specification."


topic map node

  Uses the terms 'object' and 'the system' with no explanation of what
  these are. Since this specification defines nothing but a grammar
  the term does not seem to be very useful. My suggestion is that it
  be left for the object model to define this.

  If so, 2.2.5.1 should also be removed, as should the use of the term
  in 2.2.5 para 1.


unconstrained scope

  A scope is a set of topics (as per the definition of scope), so what
  are the contents of this set? When filtering characteristics by
  scope in applications this will determine the outcome. The most
  reasonable alternative in my view is that it is the empty set.
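
  A minimal sketch of why the choice matters when filtering, written
  in Python.  The subset rule used here (a characteristic applies when
  every theme in its scope is part of the current context) is an
  assumption for illustration, as are all the names and the sample
  data; the specification does not state this rule.

```python
def in_scope(characteristic_scope, context):
    """Assumed rule: the scope applies if it is a subset of the context."""
    return frozenset(characteristic_scope) <= frozenset(context)


# Hypothetical base names, each paired with its scope (a set of themes).
names = [
    ("Norge", {"norwegian"}),
    ("Norway", {"english"}),
    ("NO", set()),  # unconstrained scope modelled as the empty set
]


def select(names, context):
    """Return the names whose scope applies in the given context."""
    return [name for name, scope in names if in_scope(scope, context)]


english_names = select(names, {"english"})
no_context_names = select(names, set())
```

  Under that rule the empty set is a subset of every context, so a
  characteristic in the unconstrained scope is selected everywhere,
  including when the application supplies no context at all.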


2.2 Overview of topic maps
==========================

2.2.1, para 1

  "(also known as an topic types)"
                  ^^

2.2.1.3, para 1

  Could explain more clearly what happens when two topics are
  discovered to have the same subject, and also how this is
  established automatically. Something about what to do when it is not
  automatically discovered would also be nice.


2.2.1.3, para 2

  It can also be established through topic names, as per the TNC.


2.2.1.4, para 1

  "When two topics use the same resource to indicate their
  subject...".  How does an application know that the two resources
  are the same?  

  "...and must therefore be merged."  This is a bit vague.  Who must
  do this, how, and when?


2.2.1.6, para 2

  "Scope is considered to establish a namespace for topics."  It might
  be better to say 'topic base names' here, since the reader is
  otherwise led to wonder about what happens with occurrences and
  roles. 

  "implicitly refer to the same subject and therefore should be
  merged."  In 2.2.1.4 merging 'must' happen, while here it only
  'should'.  Since this is a normative section, the term 'must' should
  be used here as well.
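
  To make the 'namespace for topic base names' reading concrete, here
  is a small Python sketch.  It assumes that two base names collide
  when both the name string and the scope (as a set of themes) are
  equal; the function name and the triple representation are
  hypothetical, not taken from the specification.

```python
from collections import defaultdict


def tnc_merge_candidates(base_names):
    """Find topics sharing a base name within the same scope.

    `base_names` is an iterable of (topic_id, name_string, scope)
    triples, where scope is a set of theme identifiers.
    """
    by_key = defaultdict(set)
    for topic_id, name, scope in base_names:
        by_key[(name, frozenset(scope))].add(topic_id)
    # Only keys used by more than one topic call for merging.
    return {key: sorted(ids) for key, ids in by_key.items() if len(ids) > 1}


candidates = tnc_merge_candidates([
    ("t1", "Paris", {"city"}),
    ("t2", "Paris", {"city"}),
    ("t3", "Paris", {"person"}),
])
```

  Here t1 and t2 would be merged, while t3 is left alone because its
  base name lives in a different scope.  Occurrences and roles play no
  part in the key.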


2.2.2

  Uses the term 'name' in several places where the formally defined
  term 'topic name' is most likely meant.


2.2.2.3

  The relationship between parameters and scope is left undefined, and
  the whole concept seems underspecified. What _is_ a processing
  context, and of what use are parameters for choosing between them?

  Also, is it allowable for an application to have a single 'context'
  which is used as the basis for choosing both the appropriate base
  name and the appropriate variant(s) of it?  Should one always choose
  one variant or many?  An




My comments after the point above were lost due to a disk crash on my
laptop this morning. The comments on annex F below were rewritten in a
hurry now. I will post the missing comments later if and when I
succeed in exhuming them from the wreck of my laptop.

Annex F
=======

Throughout the annex the undefined terms 'subject constituting
resource' and 'subject indicating resource' are used. These seem to be
the same as 'addressable subject' and 'subject indicator' in the spec
proper, and so the annex should be made consistent with the spec
proper.

Annex F must be made normative, since there are normative references
to it throughout sections 2, 3 and 4, which are themselves normative.


F.2.5

  Should mention explicitly that variants are ignored here and are
  catered for by F.6.2.


F.2.6

  Occurrences are equal if "the resource data values that are the
  occurrences are equal [Note : equality of the resource data value is
  determined by string equality.]".

  This is a bug, since it means that all occurrences of all topics
  that are being merged must be downloaded and compared for equality,
  which is too costly to even contemplate. Furthermore, the string
  equality principle as defined cannot be applied to binary data.
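
  One cheaper reading, in line with the general comment on locators at
  the top of this message, is sketched below in Python.  It assumes
  that an occurrence either references an external resource by locator
  or carries inline string data, and that only the inline data is
  compared by string equality.  The class, its fields and the rule
  itself are illustrative assumptions, not the text of F.2.6.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Occurrence:
    type_locator: Optional[str]
    resource_ref: Optional[str] = None   # locator of an external resource
    resource_data: Optional[str] = None  # inline string data


def occurrences_equal(a, b):
    """Compare occurrences without fetching any external resource."""
    if a.type_locator != b.type_locator:
        return False
    if a.resource_ref is not None or b.resource_ref is not None:
        # External resources: compare the locators only.
        return a.resource_ref == b.resource_ref
    # Inline data: string equality is cheap and well defined.
    return a.resource_data == b.resource_data


pdf_1 = Occurrence("http://example.org/type/spec",
                   resource_ref="http://example.org/doc.pdf")
pdf_2 = Occurrence("http://example.org/type/spec",
                   resource_ref="http://example.org/doc.pdf")
note = Occurrence("http://example.org/type/spec", resource_data="draft")
```

  With this rule two references to the same PDF compare equal without
  the binary content ever being read, and a reference never equals an
  inline value.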


F.3.3

  This looks like an equality principle to me, and could be used to
  simplify F.2.4.


F.4.2

  This seems to be superfluous, since this constraint is already
  defined by the DTD.


F.5.1

  Perhaps F.4 should be called 'Operations' and this section moved
  there. F.5 could then be called 'Merge conditions'.

  Also, point 2 of the error conditions section is a bug.  To check
  this a processor will have to download every subject indicator,
  parse it as an XML document, build a topic map from it, locate the
  association inside it and do the comparison.  This is, again, too
  costly to contemplate.

  If this is not what is meant, this should be spelled out.

  Also, there seems to be no good reason to treat associations
  specially and not do the same for all other topic map constructs
  that might be reified and for which equality rules are defined.

  I think this should be removed.


F.6.3

  What is the para after the postcondition doing there?

--Lars M.


Follow-Ups:
- [xtm-wg] Lars Marius' comments
  - From: Steve Pepper <pepper@ontopia.net>