Subject: [xtm-wg] Spec comments
General
=======

Throughout the specification the emphasis placed on subject indicators and the computable identity of resources seems to indicate that it is the resources used as indicators that are the basis for merging, when it must in fact be the locators of the indicators. This must be so because it is utterly unreasonable to expect processors to download all indicators and compare them to compute identity. It might be useful if the emphasis were shifted onto the locators instead, to make this clearer.

The terms 'address' and 'locator' are nowhere defined, by the way, even though they are absolutely central to the specification.

1.2 Goals
=========

Second para
  There are also references to RFCs 2119 and 1738 in the specification. Section 4.1 should perhaps be moved into 1.3.

1.3 Terminology
===============

Since this whole section is in any case repeated in a different order in section 2 I think it might as well be removed. If anything should happen to not be defined in section 2 the appropriate text could be moved there. Section 2 is far more readable, and so 1.3 does not seem to serve any purpose, especially given the frequent hyperlinks from the use of terms to their definitions.

Removing it would
 - reduce bloat,
 - not change the meaning of the spec (unless there be bugs),
 - make it less forbidding to read (since the spec would start with a gentle introduction, rather than a definition of terms in non-conceptual order), and,
 - reduce the chances of internal contradictions.

consistent topic map
  Defined as a topic map that has "one topic per subject and no further opportunities for merging or duplicate suppression, as defined in Annex F". However, as is noted elsewhere, the rules of Annex F are not sufficient to guarantee that there will be one topic per subject.
  The definition should be weakened to allow for this, especially since F.1 para 1 and 2.2.5.2 require processors to present a consistent topic map to the application, something that the processor under this definition cannot possibly be expected to achieve.

  The term is also used in B.9, but in what seems to be a weaker form (no mention of duplicate suppression): "It is possible for more than one Topic in the Topic Map to reify the same Subject. If no Subject is reified by more than one Topic in the Topic Map, then the Topic Map is said to be a Consistent Topic Map." I suggest that these two sentences simply be removed, as they do not say anything new in any case.

member
  Defined as "A topic that plays a role in an association.", which in 2.2.4.1 becomes "A member is a topic (or set of topics) that plays a particular role in an association." I believe both definitions should define a member as a non-empty set of topics.

  When compared with the definition of 'role' the result is also rather confusing. How can one distinguish between a role and a member, and of what use is that distinction? In my opinion the term 'member' should be removed, but allowed to remain in the syntax. Even just to remove it from the definition of 'role' would help.

role
  The definitions here and in 2.2.4.2 are too brief to provide any real understanding of what a role really is. The syntax seems to indicate that it is effectively specified by the role type, but this is nowhere stated.

  Also compare the definition with the usage in 2.2.4, para 1, sentence 2. The role-as-topic-characteristic seems awkward with this definition.

subject identity
  3. seems to not be a separate sense or meaning of the term, but additional information about 2.

topic characteristic
  Defined as a topic name, occurrence or role, where topic name is defined as a base name. So what about variant names? Suggestion: mention variants under the definition of topic name.

topic map document
  Defined as "A document that contains one or more topic maps that conform to this specification. It may be serialized for the purpose of storage or interchange in a syntax governed by this or some other specification."

  The first sentence seems to contradict the second. According to the first para of the abstract this specification defines a 'grammar', which is a syntax. (It is certainly not a model!) Hence, something that conforms to this specification must also conform to the syntax.

  The term 'document' to me implies that it must be an XML or SGML document, but this definition seems to imply that it can be a topic map structure in a database or some completely different serialization syntax. To me it would seem sensible to reserve some other term for some of these uses, such as 'topic map graph', which it would make sense for the object model/grove model to define.

  I assume that this term is intended to cover both XTM and 13250, and if so I suggest that it be abandoned in favour of 'XTM document' and that 'topic map document' remain an informal term. There is nothing formal to base it on in any case.

  I suggest that 'XTM document' be defined as "An XML document conforming to the syntax and other requirements in this specification."

topic map node
  Uses the terms 'object' and 'the system' with no explanation of what these are. Since this specification defines nothing but a grammar the term does not seem to be very useful. My suggestion is that it be left for the object model to define this. If so, 2.2.5.1 should also be removed, and the use of the term in 2.2.5 para 1.

unconstrained scope
  A scope is a set of topics (as per the definition of scope), so what are the contents of this set? When filtering characteristics by scope in applications this will determine the outcome. The most reasonable alternative in my view is that it is the empty set.
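
To illustrate why the contents of the set matter when filtering, here is a minimal Python sketch. It assumes a filtering rule that is not taken from the specification: a base name is selected when its scope is a subset of the topics in the application's current context. The BaseName type, the visible_in function and the sample topics are all hypothetical, invented for this example. Under this assumed rule, treating unconstrained scope as the empty set means such names match every context.

```python
from typing import FrozenSet, List, NamedTuple


class BaseName(NamedTuple):
    """Hypothetical stand-in for a scoped base name."""
    value: str
    scope: FrozenSet[str]  # topic ids; empty set = unconstrained (assumption)


def visible_in(names: List[BaseName], context: FrozenSet[str]) -> List[str]:
    """Assumed subset rule: keep a name when every topic in its scope
    is also present in the application's context."""
    return [n.value for n in names if n.scope <= context]


names = [
    BaseName("Norway", frozenset()),               # unconstrained scope
    BaseName("Norge", frozenset({"norwegian"})),
    BaseName("Norwegen", frozenset({"german"})),
]

# The empty set is a subset of every context, so "Norway" is always kept.
print(visible_in(names, frozenset({"norwegian"})))  # ['Norway', 'Norge']
print(visible_in(names, frozenset()))               # ['Norway']
```

Under a different rule, for instance one requiring a non-empty intersection between scope and context, an empty scope would match nothing at all, which is why the contents of the set need to be stated.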

2.2 Overview of topic maps
==========================

2.2.1, para 1
  "(also known as an topic types)"
                  ^^

2.2.1.3, para 1
  Could explain more clearly what happens when two topics are discovered to have the same subject, and also how this is established automatically. Something about what to do when it is not automatically discovered would also be nice.

2.2.1.3, para 2
  It can also be established through topic names, as per the TNC.

2.2.1.4, para 1
  "When two topics use the same resource to indicate their subject...". How does an application know that the two resources are the same?

  "...and must therefore be merged." This is a bit vague. Who must do this, how, and when?

2.2.1.6, para 2
  "Scope is considered to establish a namespace for topics." It might be better to say 'topic base names' here, since the reader is otherwise led to wonder about what happens with occurrences and roles.

  "implicitly refer to the same subject and therefore should be merged." In 2.2.1.4 merging 'must' happen, while here it only 'should'. Since this is a normative section, the term 'must' should be used here as well.

2.2.2
  Uses the term 'name' in several places where the formally defined term 'topic name' is most likely meant.

2.2.2.3
  The relationship between parameters and scope is left undefined, and the whole concept seems underspecified. What _is_ a processing context, and of what use are parameters for choosing between them?

  Also, is it allowable for an application to have a single 'context' which is used as the basis for choosing both the appropriate base name and the appropriate variant(s) of it? Should one always choose one variant or many? An

My comments after the point above were lost due to a disk crash on my laptop this morning. The comments on annex F below were rewritten in a hurry now. I will post the missing comments later if and when I succeed in exhuming them from the wreck of my laptop.

Annex F
=======

Throughout the annex the undefined terms 'subject constituting resource' and 'subject indicating resource' are used. These seem to be the same as 'addressable subject' and 'subject indicator' in the spec proper, and so the annex should be made consistent with the spec proper.

Annex F must be made normative, since there are normative references to it throughout sections 2, 3 and 4, which are themselves normative.

F.2.5
  Should mention explicitly that variants are ignored and catered for by F.6.2.

F.2.6
  Occurrences are equal if "the resource data values that are the occurrences are equal [Note : equality of the resource data value is determined by string equality.]". This is a bug, since it means that all occurrences of all topics that are being merged must be downloaded and compared for equality, which is too costly to even contemplate. Furthermore, the string equality principle as defined cannot be applied to binary data.

F.3.3
  This looks like an equality principle to me, and could be used to simplify F.2.4.

F.4.2
  This seems to be superfluous, since this constraint is already defined by the DTD.

F.5.1
  Perhaps F.4 should be called 'Operations' and this section moved there. F.5 could then be called 'Merge conditions'.

  Also, point 2 of the error conditions section is a bug. To check this a processor will have to download every subject indicator, parse it as an XML document, build a topic map from it, locate the association inside it and do the comparison. This is, again, too costly to contemplate. If this is not what is meant, this should be spelled out.

  Also, there seems to be no good reason to treat associations specially and not do the same for all other topic map constructs that might be reified and for which equality rules are defined. I think this should be removed.

F.6.3
  What is the para after the postcondition doing there?

--Lars M.