Subject: Re: [xtm-wg] An XTM test suite
[Lars Marius Garshol:]

> The idea is _not_ a simplified XTM syntax designed to
> be simpler to parse and implement. (It will probably
> be simpler than XTM, but only because it must.)

I don't understand how it can be simpler than the existing XTM syntax. It looks to me as though it must have more element types (such as ones that make topic namespaces redundantly explicit), and that the element types that do correspond (in some sense) to XTM element types will necessarily have different semantics, as well.

For example, the Conceptual Model clearly establishes that, under the covers, an occurrence is really a topic-occurrence association. What does this mean for the "canonical output" form? I believe that we must output a topic-occurrence association (note that I did *not* say <association>, I said "association"). There are several such important distinctions.

> Certainly, it is _not_ intended as a competitor to
> the XTM syntax, only as a tool for making
> applications that support XTM 1.0 more reliable.

Understood.

> This is close to it, yes. The idea is that in this
> syntax, any two topic maps that are logically
> equivalent will have the exact same serialized
> representation.

It's a good idea, if we can make it work.

> A canonical XTM document must
>
> - be UTF-8-encoded

Why this particular encoding? What does character encoding have to do with it, as long as the mappings between character encodings are unambiguous and explicit?

> - have all elements (topic, association, baseName,
> topicRef etc) in a specific order, probably based on
> the lexical order of IDs and names

I don't see how this can work, unless we want to straitjacket the order in which <topicMap> elements and their contents are scanned and processed, and force all applications to keep a record of that order, even though that order has no significance.
This is a very unappealing prospect: to require applications to keep track of nonsignificant information, incurring significant overhead just so their conformance to the Spec can be verified. If I were a developer, I'd simply ignore a standard that required me to write software that does things that force my customers to spend money in ways that don't benefit them. There *must* be a better answer than this.

The unique identifiers (IDs) of elements found in the content of <topicMap> elements cannot serve as the basis for imposing a canonical order, either.

* First of all, many (perhaps most?) of the elements that demand the existence of topics in the application-internal representation are #IMPLIED, so we won't have IDs for all of them. What do we do with the ones that don't have IDs?

* Secondly, when we're merging multiple XTM documents, the IDs of the elements aren't necessarily unique. What do we do when two topics have the same ID?

> - have all attributes in a specific order (and
> possibly conform to the canonical XML specification)

OK. (Why only "possibly"? Making everything totally deterministic is the whole point of this exercise.)

> - use insignificant whitespace in a pre-determined way

OK.

> - be consistent (as per annex F)

OK, as far as Annex F (I think misleadingly) goes.

> - have all externally referenced topic map documents merged in

Right.

> - have only normalized URIs

What constitutes "normalization" of URIs? In the topic map paradigm, it's vitally important that two URIs that point to the same resource be recognizable as equivalent. However, some applications will have more intelligence about this than others; some will detect sameness that others will miss, because, for example, some will understand some kinds of fragment identifiers better than others will.
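
To make the question about URI "normalization" concrete, here is a rough Python sketch of what a purely syntactic normalization might look like. Nothing in it comes from the XTM 1.0 Spec or from Lars Marius's proposal: the function names, the table of default ports, and the example.org URIs are all assumptions made up for illustration, and the dot-segment handling is only an approximation of what RFC 3986 describes.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed table of default ports; a real processor might know more schemes.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def remove_dot_segments(path: str) -> str:
    """Drop "." segments and resolve ".." segments (simplified, not full RFC 3986)."""
    out = []
    for seg in path.split("/"):
        if seg == ".":
            continue
        if seg == "..":
            if len(out) > 1:  # never pop the leading empty segment
                out.pop()
            continue
        out.append(seg)
    return "/".join(out)


def normalize_uri(uri: str) -> str:
    """Hypothetical syntactic normalization: case, default port, dot segments.

    Userinfo (user:password@) is discarded; this is a sketch, not a spec.
    """
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if ":" in host:  # IPv6 literal: restore the brackets urlsplit removed
        host = f"[{host}]"
    netloc = host
    if parts.port is not None and DEFAULT_PORTS.get(scheme) != parts.port:
        netloc = f"{host}:{parts.port}"
    path = remove_dot_segments(parts.path)
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


a = "HTTP://Example.ORG:80/maps/./geo/../topics.xtm#paris"
b = "http://example.org/maps/topics.xtm#paris"
c = "http://www.example.org/maps/topics.xtm#paris"

print(normalize_uri(a) == normalize_uri(b))  # True: differences are only syntactic
print(normalize_uri(a) == normalize_uri(c))  # False: needs knowledge beyond syntax
```

Even if a and c name the same resource, no amount of string manipulation will reveal it; an application would need extra knowledge (about host aliases, redirects, or what a fragment identifier means for a given media type) to detect that kind of sameness.
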
We must not create a conformance requirement that prevents application builders from competing on the basis of the amount of intelligence that is brought to bear on the question of whether two URIs actually refer, ultimately, to one and the same resource. We want them to compete on this; the ideal case, in which all URIs that ultimately refer to one and the same resource are known to be doing so, is probably never going to be fully achieved.

One way to handle this is to support a user's ability to "dumb down" the URI-comparison processing to some specified level, just for purposes of outputting a canonical form simply for establishing conformance to the Spec in all other Spec-required respects.

> - represent all topic map constructs in a single way
> (so, for example, <instanceOf> and <scope> will only
> ever contain <topicRef>, since <subjectIndicatorRef>
> and <resourceRef> are implicit <topicRef>s)

This remark leads me to believe that you are thinking in terms of using some version of the XTM syntax as the canonical output syntax, as if XTM syntax were somehow the same thing as this canonical output idea. This is a bad idea, for a variety of reasons, and especially the reasons I've already mentioned in previous notes. Let me add more reasons:

* It would be very bad if there were any confusion whatsoever about whether a particular XML element or document is expressed in XTM syntax or in our canonical output syntax. The best way to avoid such confusion is to avoid having element type names in common between the two syntaxes.

* Having element type names in common will greatly diminish our (the XTM Authoring Group's) ability to communicate clearly and unambiguously among ourselves.
When we say "<topic>", we really must be disciplined in meaning only what that string (<topic>) means at input time, because the corresponding construct that appears in canonical output is not exactly the same kind of thing (for one example of why this is true, see the discussion of topic-occurrence associations, above). If we don't establish these distinctions in our discussions, we will misunderstand each other, and our productivity as a group will be diminished.

* Having element type names in common will muddle our thinking as individuals. We must not allow ourselves to make unconscious assumptions about the nature of processed topic map information. The structure of the canonical output must reflect precisely the abstract structure of the application-internal form of topic map information, as it will be defined by the Authoring Group. The syntactic structure of the input documents is irrelevant, and pretending that it is somehow relevant will only blind and confuse us.

> | Syntactic equivalences between XTM <topicMap>
> | elements, as these are discussed in XTM today, are
> | insufficient to define what topic map information
> | actually is.
> This I don't follow. You seem to imply here that
> something more than what I propose above is
> needed. My problem is that I have a release schedule
> to meet and must act very quickly indeed. So if
> something radically more complex is needed I would
> prefer to do this first, and then that as a second
> stage.

OK. In order to walk in a particular direction, we must move by steps. I would only ask that each of us tries to be objective about technical decisions. That means trying not to make technical decisions on the basis of our own individual business objectives, but rather on the basis of how best to develop the industry as a whole. The only thing that competitors can be expected to agree about is how to make the industry grow (and even that much is a minor miracle).
I hope there won't be too many conflicts among us, and that the resolution of the conflicts can be navigated in a way that doesn't bruise anyone economically. Taking well-considered steps *together* is a good way to do that.

BTW, I'm voting "Yes" on XTM 1.0, although I have grave misgivings about Annex F, which I find misleading -- not so much by what it says, but by what it doesn't say.

-Steve

--
Steven R. Newcomb, Consultant
srn@coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

405 Flagler Court
Allen, Texas 75013-2821 USA