Subject: [xtm-wg] Spec comments
General
=======
Throughout the specification the emphasis placed on subject indicators
and the computable identity of resources seems to indicate that it is
the resources used as indicators that are the basis for merging, when
it must in fact be the locators of the indicators. This must be so
because it is utterly unreasonable to expect processors to download
all indicators and compare them to compute identity. It might be
useful if the emphasis were shifted onto the locators instead, to make
this clearer.
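To illustrate the point, here is a minimal Python sketch (not taken from the specification) of merging driven purely by locator comparison. The topic ids, the function name and the example URLs are hypothetical, and plain string equality of locators is an assumption; the spec does not define locator comparison. Nothing is ever downloaded.

```python
from collections import defaultdict


def merge_by_indicator_locator(topics):
    """Group topics that share at least one subject indicator locator.

    `topics` maps a (hypothetical) topic id to a set of locator strings.
    Only the locator strings are compared; the indicator resources
    themselves are never retrieved.
    """
    parent = {tid: tid for tid in topics}

    def find(tid):
        # Union-find lookup with path halving.
        while parent[tid] != tid:
            parent[tid] = parent[parent[tid]]
            tid = parent[tid]
        return tid

    first_seen = {}  # locator -> first topic id that used it
    for tid, locators in topics.items():
        for loc in locators:
            if loc in first_seen:
                parent[find(tid)] = find(first_seen[loc])
            else:
                first_seen[loc] = tid

    groups = defaultdict(set)
    for tid in topics:
        groups[find(tid)].add(tid)
    return sorted(sorted(group) for group in groups.values())
```

Note that merging is transitive here: two topics with no locator in common still end up in the same group if a third topic shares a locator with each of them.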
The terms 'address' and 'locator' are nowhere defined, by the way,
even though they are absolutely central to the specification.
1.2 Goals
=========
Second para
There are also references to RFCs 2119 and 1738 in the
specification. Section 4.1 should perhaps be moved into 1.3.
1.3 Terminology
===============
Since this whole section is in any case repeated in a different order
in section 2 I think it might as well be removed. If anything should
happen to not be defined in section 2 the appropriate text could be
moved there. Section 2 is far more readable, and so 1.3 does not seem
to serve any purpose, especially given the frequent hyperlinks from
the use of terms to their definitions.
Removing it would
- reduce bloat,
- not change the meaning of the spec (unless there be bugs),
- make it less forbidding to read (since the spec would start with a
gentle introduction, rather than a definition of terms in
non-conceptual order), and,
- reduce the chances of internal contradictions.
consistent topic map
Defined as a topic map that has "one topic per subject and no
further opportunities for merging or duplicate suppression, as
defined in Annex F".
However, as is noted elsewhere, the rules of Annex F are not
sufficient to guarantee that there will be one topic per subject.
The definition should be weakened to allow for this, especially
since F.1 para 1 and 2.2.5.2 require processors to present a
consistent topic map to the application, something that the
processor under this definition cannot possibly be expected to
achieve.
The term is also used in B.9, but in what seems to be a weaker
form (no mention of duplicate suppression):
"It is possible for more than one Topic in the Topic Map to reify
the same Subject. If no Subject is reified by more than one Topic in
the Topic Map, then the Topic Map is said to be a Consistent Topic
Map."
I suggest that these two sentences simply be removed, as they do not
say anything new in any case.
member
Defined as "A topic that plays a role in an association.", which in
2.2.4.1 becomes "A member is a topic (or set of topics) that plays a
particular role in an association." I believe both definitions
should define a member as a non-empty set of topics.
When compared with the definition of 'role' the result is also
rather confusing. How can one distinguish between a role and a
member, and of what use is that distinction? In my opinion the term
'member' should be removed, but allowed to remain in the syntax.
Even just to remove it from the definition of 'role' would help.
role
The definitions here and in 2.2.4.2 are too brief to provide any
real understanding of what a role really is. The syntax seems to
indicate that it is effectively specified by the role type, but this
is nowhere stated.
Also compare the definition with the usage in 2.2.4, para 1,
sentence 2. The role-as-topic-characteristic seems awkward with this
definition.
subject identity
Sense 3 seems not to be a separate sense or meaning of the term, but
additional information about sense 2.
topic characteristic
Defined as a topic name, occurrence or role, where topic name is
defined as a base name. So what about variant names?
Suggestion: mention variants under the definition of topic name.
topic map document
Defined as "A document that contains one or more topic maps that
conform to this specification. It may be serialized for the purpose
of storage or interchange in a syntax governed by this or some other
specification."
The first sentence seems to contradict the second. According to the
first para of the abstract this specification defines a 'grammar',
which is a syntax. (It is certainly not a model!) Hence, something
that conforms to this specification must also conform to the syntax.
The term 'document' to me implies that it must be an XML or SGML
document, but this definition seems to imply that it can be a topic
map structure in a database or some completely different
serialization syntax. To me it would seem sensible to reserve some
other term for some of these uses, such as 'topic map graph', which
it would make sense for the object model/grove model to define.
I assume that this term is intended to cover both XTM and 13250, and
if so I suggest that it be abandoned in favour of 'XTM document' and
that 'topic map document' remain an informal term. There is nothing
formal to base it on in any case.
I suggest that 'XTM document' be defined as "An XML document
conforming to the syntax and other requirements in this
specification."
topic map node
Uses the terms 'object' and 'the system' with no explanation of what
these are. Since this specification defines nothing but a grammar
the term does not seem to be very useful. My suggestion is that it
be left for the object model to define this.
If so, 2.2.5.1 should also be removed, along with the use of the
term in 2.2.5 para 1.
unconstrained scope
A scope is a set of topics (as per the definition of scope), so what
are the contents of this set? When filtering characteristics by
scope in applications this will determine the outcome. The most
reasonable alternative in my view is that it is the empty set.
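A small Python sketch of why the choice matters when filtering. The filter rule used here (a characteristic is selected when its scope is a subset of the application's context) is my assumption for illustration, not something the spec states, and the names and scoping topics are made up.

```python
def visible(characteristics, context):
    """Return the names whose scope is a subset of `context`.

    `characteristics` is a list of (name, scope) pairs, where scope is
    a frozenset of scoping-topic ids. The unconstrained scope is
    modelled as the empty set, which is a subset of every context.
    """
    return [name for name, scope in characteristics if scope <= context]


names = [
    ("Norway", frozenset()),                # unconstrained scope
    ("Norge", frozenset({"norwegian"})),
    ("Norwegen", frozenset({"german"})),
]
```

With the empty-set reading, `visible(names, {"norwegian"})` selects both "Norway" and "Norge", and an empty context still selects "Norway". Under any other reading of the unconstrained scope the outcome of the same filter would differ, which is why the spec needs to say what the set contains.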
2.2 Overview of topic maps
==========================
2.2.1, para 1
"(also known as an topic types)"
                ^^
2.2.1.3, para 1
Could explain more clearly what happens when two topics are
discovered to have the same subject, and also how this is
established automatically. Something about what to do when it is not
automatically discovered would also be nice.
2.2.1.3, para 2
It can also be established through topic names, as per the TNC.
2.2.1.4, para 1
"When two topics use the same resource to indicate their
subject...". How does an application know that the two resources
are the same?
"...and must therefore be merged." This is a bit vague. Who must
do this, how, and when?
2.2.1.6, para 2
"Scope is considered to establish a namespace for topics." It might
be better to say 'topic base names' here, since the reader is
otherwise led to wonder about what happens with occurrences and
roles.
"implicitly refer to the same subject and therefore should be
merged." In 2.2.1.4 merging 'must' happen, while here it only
'should' happen. Since this is a normative section, the term 'must' should
be used here as well.
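As a rough Python sketch of scope acting as a namespace for topic base names: two topics become merge candidates when they have the same base name string in the same scope. The tuple layout, the function name and the example data are hypothetical, and exact string equality of names is assumed.

```python
from collections import defaultdict


def tnc_merge_candidates(base_names):
    """Find topics that share a base name within the same scope.

    `base_names` is an iterable of (topic_id, name_string, scope)
    tuples, where scope is a set of scoping-topic ids. Returns a dict
    mapping (name_string, scope) to the sorted ids of the topics that
    collide on that key.
    """
    by_key = defaultdict(set)
    for topic_id, name, scope in base_names:
        by_key[(name, frozenset(scope))].add(topic_id)
    return {key: sorted(ids) for key, ids in by_key.items() if len(ids) > 1}
```

Only base names take part in the key, which is why saying 'topic base names' rather than 'topics' would be more precise: occurrences and roles play no part in this check.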
2.2.2
Uses the term 'name' in several places where the formally defined
term 'topic name' is most likely meant.
2.2.2.3
The relationship between parameters and scope is left undefined, and
the whole concept seems underspecified. What _is_ a processing
context, and of what use are parameters for choosing between them?
Also, is it allowable for an application to have a single 'context'
which is used as the basis for choosing both the appropriate base
name and the appropriate variant(s) of it? Should one always choose
one variant or many? An
My comments after the point above were lost due to a disk crash on my
laptop this morning. The comments on annex F below were rewritten in a
hurry now. I will post the missing comments later if and when I
succeed in exhuming them from the wreck of my laptop.
Annex F
=======
Throughout the annex the undefined terms 'subject constituting
resource' and 'subject indicating resource' are used. These seem to be
the same as 'addressable subject' and 'subject indicator' in the spec
proper, and so the annex should be made consistent with the spec
proper.
Annex F must be made normative, since there are normative references
to it throughout sections 2, 3 and 4, which are themselves normative.
F.2.5
Should mention explicitly that variants are ignored here and are
catered for by F.6.2.
F.2.6
Occurrences are equal if "the resource data values that are the
occurrences are equal [Note : equality of the resource data value is
determined by string equality.]".
This is a bug, since it means that all occurrences of all topics
that are being merged must be downloaded and compared for equality,
which is too costly to even contemplate. Furthermore, the string
equality principle as defined cannot be applied to binary data.
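One possible alternative, sketched in Python purely as an illustration (it is not what Annex F says): compare external occurrences by locator and only inline string data by value. The `Occurrence` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Occurrence:
    locator: Optional[str] = None  # address of an external resource
    data: Optional[str] = None     # inline string data


def occurrences_equal(a, b):
    """Compare two occurrences without retrieving any resource.

    External resources are compared by locator string, so binary
    content is never touched. Inline data is compared as strings.
    An external and an inline occurrence are never considered equal.
    """
    if a.locator is not None and b.locator is not None:
        return a.locator == b.locator
    if a.data is not None and b.data is not None:
        return a.data == b.data
    return False
```

This keeps the comparison cheap, at the price of treating two different locators for the same content as distinct occurrences.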
F.3.3
This looks like an equality principle to me, and could be used to
simplify F.2.4.
F.4.2
This seems to be superfluous, since this constraint is already
defined by the DTD.
F.5.1
Perhaps F.4 should be called 'Operations' and this section moved
there. F.5 could then be called 'Merge conditions'.
Also, point 2 of the error conditions section is a bug. To check
this a processor will have to download every subject indicator,
parse it as an XML document, build a topic map from it, locate the
association inside it and do the comparison. This is, again, too
costly to contemplate.
If this is not what is meant, this should be spelled out.
Also, there seems to be no good reason to treat associations
specially and not do the same for all other topic map constructs
that might be reified and for which equality rules are defined.
I think this should be removed.
F.6.3
What is the para after the postcondition doing there?
--Lars M.