Subject: Re: [topicmaps-comment] RE: [sc34wg3] Re: PMTM4 and XTM Layer 1.0

[Graham Moore:]
> TopicMaps typically make the metamodel, i.e. classes and assoc templates
> part of the runtime environment, UML typically never has the metamodel
> as part of a run time environment. That's because the run time env is
> often java or python etc, which just don't have these features. Perhaps
> the closest we get to having access to the metamodel at runtime is
> something like smalltalk.

Right.

> Secondly, topicmaps at the moment privilege certain things in order to
> achieve commonality and interchange. These things are names and identity
> structures. Occurrence I can quite happily see as Assocs in the model.
> Because these things are privileged in the data model in the layer 1.0
> model, there is no ambiguity about them - interchange and
> interoperability is strengthened.
> One concern I have about PMTM4 is that because everything is so regular,
> these distinctions are lost from the basic data model. PMTM4 is reliant
> on the fact that people interpret a Data driven model with the correct
> semantics rather than the semantics of the model (when I use semantics
> here I mean the ability to find a topic's name etc, the ability to find
> out about the topic map from the data model.) being an integral,
> unambiguous, part of the model.

Right.

> An example, if names are modelled as assocs in PMTM it requires that
> there are PSIs, probably, to identify name structures, these have to be
> interpreted by SW in order to impose the correct semantic upon these
> regular structures.

Right.

> If naming is a first class principle then it should
> be there as such.

You seem to be suggesting that the notion of naming should *not* be the subject of a topic. Put another way, you seem to be suggesting that the "naming assertion type" (PMTM4 calls it "the topic-basename association template") should not appear as an association template, and the connection between a topic and a name for that topic should not be an association. Is that right?
If that's what you're saying, I disagree with you. I see significant disadvantages, and no advantages. The disadvantages include:

* Only one kind of naming. What if a popular application needs to use an altogether different kind of naming (such as naming without the topic naming constraint that several people have decried as unreasonably burdensome)? What if a popular application needs to use only a limited subclass of the standard kind of naming (see below an example of a powerful reason for supporting such subclassing)? How can that be expressed if there's nothing explicit to subclass? Shall we create a completely separate, special mechanism just for names? Ugh.

* Inconsistency. Naming assertions would be handled differently from all other kinds of assertions. This would create more mechanisms to implement and understand, and lots of opportunity for many kinds of mischief. We would be unable to apply the whole power of the topic maps paradigm to naming issues and to names.

I can imagine arguments in favor of your proposal that would go something like this:

  Naming is special. It's different from everything else. We need to have namespaces for names, and we need to be able to use names as the addresses of the topics that they name. We need to be able to take advantage of the special features of languages (e.g. Python) that provide convenience features for storing and looking up names. All such paradigms make a big distinction between naming and everything else. Why should Topic Maps be any different?

I'm unsympathetic to these arguments. Every assertion type is special, not just topic-basename assertions. There is an unbounded number of assertion types, each of which can impose an unbounded amount of processing complexity when used in support of the applications for which they were created.

The basic attraction of the Topic Maps paradigm is that it provides a way to simplify and unify the expression, interchange, merging, and infoglut control of relationships between arbitrary subjects. Surely the concept of naming is itself a subject. Surely, since I can say, "The name 'Graham' is a perfectly nice name," the name "Graham" is itself at least potentially a subject. Simplicity and ease of implementation are not well served by exempting certain subjects from the same discipline to which all other subjects are, uh, subject. (To those readers for whom English is not their native language, I apologize for the previous sentence, which uses the word "subject" in two different senses.)

Nothing prevents the exploitation of the special features of various languages (like Python) and systems (like RDBMSs) when implementing the processing complexity of naming, or of any other assertion type.

It should not bother us that Topic Maps is different from every other paradigm. (Otherwise, why use Topic Maps? People should use the paradigms that suit their purposes. If they need a paradigm that can absolutely collate everything known about any given set of subjects, then they might want to think about using Topic Maps. And we'd better make sure that Topic Maps can really, really do that.)

The purpose of Topic Maps is to bring everything under one umbrella -- to facilitate the merging of arbitrary knowledge from diverse sources. If we make Topic Maps a system of two umbrellas, one for names, and the other for everything else, we will have violated the fundamental purpose of Topic Maps.

> Returning to the UML / TM comparison, identity is *in* UML, but it
> doesn't make the distinction between resolvable topics and conceptual
> ones. However, being UML it could create a class to do this - the point
> is that it isn't standardised. I think this is one of the best things TM
> offers the world.
>
> Another UML reference. PMTM4 see everything as a topic, including
> strings and I assume integers - this brings it very close to the UML
> model that just connects Objects together.
>
> As the full PMTM4 model isn't described yet - I don't see what in a data
> model a Topic now looks like :
>
> e.g.
>
> <topic>
>   <baseName>
>     <baseNameString>Graham</baseNameString>
>   </baseName>
> </topic>
>
> Generate 2 topics in PMTM4
>
> 1. topic for the topic named 'graham'
>
> and
>
> 2. a topic for the string 'graham'
>
> Now, what I ask steve and michel is, how does this micro part of the
> model pan out, does for example a topic have a type - like string, int,
> Object, Topic?
>
> Topic.value => "graham" ??
> Topic.type => "String"

When I read your words above, I have this queasy feeling that Ted Nelson calls "paradigm warp" -- a phenomenon that occurs when subtle differences between world views make communication between people extra challenging.

For me, topics don't have "values". In the universe of Platonic forms, they have subjects. That's the closest thing I can think of to the idea that a topic has a "value". In computer-processable terms, the closest thing I can think of to topics having "values" is the fact that topics have subject identity points (subject indicators, plus zero or one subject constituter).

Topics can have any number of types, not just one type, and each type (or class) is itself a topic. The type of a topic is never a string, such as "String". It's a topic, which, again, has subject indicators, plus zero or one subject constituter.

The nature of a subject indicator is not constrained, except that it must be addressable information. However, the nature of a subject indicator for a topic whose subject is a topic name *can* be considered to be constrained by the XTM syntax. XTM provides a special syntax just for this kind of subject indicator. In XTM, as you point out, it's normally a string which is the literal content of a <baseNameString> element.
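
To put the last few paragraphs in concrete terms, here is a minimal Python sketch of a topic as I have just described it: no "value", only subject identity points, and any number of types, each of which is itself a topic. This is an illustration only, not part of PMTM4 or XTM. The class name, the field names, and every address below are invented for the purpose.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # identity-based equality, so topics can be stored in sets
class Topic:
    """A topic is anchored by subject identity points, not by a 'value'."""
    # Addresses of information that indicates the subject.
    subject_indicators: set = field(default_factory=set)
    # Zero or one address of information that *is* the subject.
    subject_constituter: Optional[str] = None
    # Any number of types; each type is itself a Topic, never a string.
    types: set = field(default_factory=set)


# Hypothetical type topics; the addresses are made up for this sketch.
name_type = Topic(subject_indicators={"http://example.org/psi/name"})
given_name_type = Topic(subject_indicators={"http://example.org/psi/given-name"})

# The topic whose subject is the name "Graham". Its subject indicator is the
# content of a <baseNameString> element, addressed by an invented URI.
graham_name = Topic(
    subject_indicators={"http://example.org/map.xtm#graham-basenamestring"},
    types={name_type, given_name_type},
)
```

Note that there is no Topic.value and no Topic.type == "String" here. Asking for the types of graham_name gives back topics, each of which has its own subject indicators.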

I know you know all this, Graham. This is all a warmup so I can get past my paradigm warp problem. Please be patient with me. Remember I was a classroom teacher for over two decades; some habits never die.

Let's do a little exercise, now, that I hope will be helpful. If it's true that, as PMTM4 claims, all the features of any syntax for topic maps ultimately boil down to a set of assertion types, no more and no less, and if it's true that topics always boil down to subject identity points, no more and no less, then it must be true that we can use a small subset of the XTM syntax -- just the association syntax -- to say exactly the same thing that any specialized feature of XTM syntax allows us to say.

Let's test this claim by seeing if we can really do without the very specialized <baseNameString> feature of XTM. (Note: I'm *not* proposing that we get rid of <baseNameString>. This is just an exercise that I hope will be revealing.)

For example, how can we say that a topic, such as a specific dog, has the base name "Maxwell", without using <baseNameString>?

First of all, there has to be a subject indicator for the dog. Let's imagine that we've chosen the entry for the dog whose name is Maxwell in the American Kennel Club (AKC) registration records. (We can address this entry by means of its AKC registration number, for example, but that's just a detail having to do with the mechanics of addressing this entry in its own proper context. If the AKC sees fit to make these entries uniquely addressable via some system of URIs, so much the better; we can then do this on the Web. Whether the registration is available by means of a URI on the Web or not doesn't matter for purposes of this discussion.)

Since we're trying to attach a name to the topic whose subject is the dog, and since we're not allowing ourselves to use <baseNameString>, we must use an association that has, as its template, the "topic-basename" association template.
The topic whose subject indicator is the AKC registration entry is addressed as the role-player of the "topic" role. (Actually, as you know very well, there doesn't have to be a <topic> element that explicitly addresses the AKC registration as its subject indicator. We can address the AKC registration entry directly from the <association> element, using a <subjectIndicatorRef>. The fact that we make this reference has the effect of demanding the existence of a topic node, among whose subject indicators is the AKC registration entry.)

How will we make the name "Maxwell" play the "basename" role in our "topic-basename" association? For that purpose, we use a <subjectIndicatorRef> that contains the address of the string, "Maxwell". It doesn't matter where the string actually is, as long as we say that this string is in fact to be treated as a subject indicator of the topic that is the role-player of the "name" role.

At this point, some questions arise.

(1) Why should we use a <subjectIndicatorRef> rather than a <resourceRef> to address the string, "Maxwell"?

If we used <resourceRef>, we'd be saying that only the particular instance of the string, "Maxwell", that we happen to be addressing is in fact the name of the dog. This wouldn't make sense. It doesn't matter where the string "Maxwell" occurs; wherever it occurs, it is one and the same name. The name "Maxwell" is an abstract subject; it exists as a single unique Platonic form in the Universe of Platonic Forms.

(2) What if somebody wants to regard that particular instance of the string, "Maxwell", as a subject constituter?

No problem. That's a different subject.
Every addressable piece of information can be regarded as the subject identity point for exactly two distinct subjects: (i) the subject that is somehow compellingly *indicated* by this piece of information, including consideration of its context, and (ii) the subject that *is* (i.e., *is constituted by*) this piece of information itself, including consideration of its context. (If another copy of it appears elsewhere, that copy is not the same subject constituter.)

(3) Look, OK, I have a topic whose subject indicator is the string, "Maxwell". How am I supposed to know that this is a subject indicator for a name? In other words, how do I know that the subject of the topic is the *name* "Maxwell", rather than being, for example, the concept of maximum health, or the deepest well in the world, or Jack Benny's infamous 1925 Maxwell brougham, or the name of a house that's full of coffee?

Very good question.

(i) First of all, I could try to duck this issue by invoking the fact that subject indicators have whatever meaning they have to whomever perceives them. This is a very unsatisfactory answer. I hate this answer. The very essence of the issue we're discussing is: Whence cometh a name's name-ishness? If we say, "It's all in the mind of the beholder," we retreat into some sort of unimplementable philosophical fantasy land. Computers don't have minds, and they don't behold anything.

(ii) Well, then, what about the fact that this addressed string plays the "name" role in a "topic-basename" association? Doesn't that establish that the string is in fact a name?

Well, I would say "Yes", except that I hate this answer, too. The reason I hate this answer is that it relies on a doctrine that I believe to be inimical to information interchange via topic maps. I have always strongly resisted this dangerous and false doctrine, and I'm still resisting it, even in this case.
The doctrine is: We should be able to tell what the subject of a topic is by analyzing its characteristics (i.e., in PMTM4 terms, by analyzing all the associations in which it plays roles).

Here is my argument against this doctrine:

(a) Computers aren't smart enough to do that kind of analysis reliably.

(b) People aren't smart enough, either. There isn't necessarily enough information to make such an analysis possible, much less reliable. A topic can exist even if it has no characteristics at all (PMTM4: even if it plays no roles in any associations).

(c) The usefulness of the whole Topic Maps paradigm rests on the assumption that there is exactly one utterly changeless subject at the heart of every topic. If Joe makes a topic in his topic map and fails to provide it with a compelling, precise, and unambiguous subject indicator, and Betty comes along and adds another assertion in which Joe's topic plays a role, how can Betty do that without knowing what Joe was really regarding as the subject of that topic? Now Natalie comes along and sees both Betty's and Joe's assertions about this topic; she makes her own assumption about the subject of the topic, based on everything Joe and Betty said, and adds some more assertions. See the problem? Here we have a topic with no real anchor. There is no longer one subject at the heart of it, unless Betty and Natalie have psychic powers and can read Joe's mind. (By the way, Joe wrote his topic map and immediately died, so nobody can ask him what he was thinking about. What a mess.) We aren't likely to have information interchange here. We're much more likely to have confusion interchange, and it might be very dangerous.

If we want Topic Maps to work reliably in the real world, we must do better than this. We can't blithely assume that people can tell what we're talking about just because they can see what we said about it. We might not say very much.

(iii) The string "Maxwell" is, by itself, a lousy subject indicator.
It is not precise, nor unambiguous, nor compelling. It sucks. However (and this is a very big however): the string "Maxwell" has context, and the context of a subject indicator can be extremely significant. For example, the string "Maxwell", when appearing between <baseNameString> tags, is very precise, unambiguous, and compelling. We know it's a name, and we know that it's a name completely independently of any assertions in which the topic that is the name "Maxwell" plays any role. (In XTM syntax, we also know which topic it's the name of, by virtue of the <topic> element in which the <baseNameString> appears, but I'm trying to ignore that for the moment. I'm trying to show that, when all is said and done, we only need an assertion, like any other assertion, to represent the fact that some particular dog has the name "Maxwell".)

Similarly, all by itself, the entry for a particular dog in the AKC registry would be a lousy subject indicator, but the same information is a *great* subject indicator when its context is known to be the AKC registry. Let's imagine that there is a field in such registry entries for the common nickname of each dog (as opposed to the weird and lengthy unique names that each dog is given in the AKC registry, such as "Leed-A-Way Honey Girl" and "Smokey Wind Jerry"). If we address the content of that entry, and the content is "Maxwell", we have a name, and even a computer can know that it's a name.

(4) OK, Steve, you've won a lot of points here, but there's a fatal flaw in your vision. Suppose there are two topics, and they both have the same subject -- the dog whose name is Maxwell -- and each of them plays the "topic" role in a "topic-basename" assertion, and in both of these "topic-basename" assertions, the player of the "basename" role is the string, "Maxwell".
Unfortunately, though, the string "Maxwell" is referenced by one of the "topic-basename" assertions in the context of <baseNameString> markup, and the string "Maxwell" is referenced by the other "topic-basename" assertion in the context of the AKC registry entry for this dog. If these two strings really indicate the same subject (the name "Maxwell"), then they must both be subject indicators for one and the same topic, after all merging has been completed. How is this magic merging supposed to happen?

Well, if this is a "fatal flaw", it is the fatal flaw of the Topic Maps paradigm as a whole, not just of the topic naming problem we're discussing. We have never claimed that computers would always be able to detect situations in which two different subject indicators actually indicate the same subject. We have always said that this kind of problem can only be attacked with heuristics and human sweat. What we *have* claimed is that the topic maps paradigm can be exploited in such a way as to preserve the value of such hard work, even when, in a topic map that is the result of merging other topic maps, we need to replace one of the contributing topic maps with a new version of itself. And this claim remains just as valid for name topics as for any other kinds of topics.

> I am concerned that by making everything a topic here we are getting
> into a lot of other issues, such as data types etc, and if this is
> the route we want to take then I think that the large similarities
> with UML and the fact that it does all of this stuff already, we
> should consider adapting the UML metamodel model to have the new
> properties of TopicMaps.

How can we make UML reflect the inclusion and full participation of the model (the taxonomy of topic types and assertion types) in the data? Personally, I cannot accept the idea that the model must be separate from the data.
That would be inconsistent with the central claim of the topic maps paradigm, which demands that there is exactly one nexus (i.e., one topic) for any given subject, no matter what it is, and that that nexus is connected to every single thing that is known about that subject.

> One last question that got me started thinking about PMTM4 integration
> with a higher abstraction,
> in the above example most people - users? - would imagine they had added
> 1 Topic and 1 String into the system.
> Asking PMTM4
>
> TopicMap.getTopics().size() || TopicMap.topics.length || Card(topic) ||
> etc
>
> would yield *2*
>
> Asking a higher level of abstraction would yield *1*
>
> At the moment I see this as a BIG stumbling block to getting an
> integrated model. If I hope I've missed something obvious.

I think I understand what you're saying. I think it's the same issue that Martin Bryan and I have been discussing in another sequence of notes. It's a reasonable and necessary requirement that we not create a situation in which our users discover, to their dismay and chagrin, that they have created topics (and, for that matter, associations) that they didn't intend to create. If a user creates a topic, and he gives it a name, then he doesn't think in terms of having two topics. He thinks he has a topic that has a name, full stop. (And, at that level of abstraction, he's absolutely right!)

The issue here should not be considered in terms of what's really in the topic map. The real issue is how the user *views* what's in the topic map, and, if we believe the ancient SGML dogma that information always turns out to have unforeseen and unforeseeable uses, we need to fully protect the flexibility available to *all* kinds of applications that might someday be used to create such arbitrary views. This doesn't diminish the importance of your more specific concern (and Martin's).
We also need to meet the reasonable requirement that authors be able to provide guidance to viewing applications that will indicate what the author thought users should see and not see. In other words, an author should have the privilege of making distinctions between the topics that users will normally be expected to see, and the topics that users will not normally be expected to see. And, as it happens, we already have a way to differentiate topics from each other in any way (and for any reasons) whatsoever: it's called "associations."

There are at least two good ways to make the distinction that we're talking about here, using associations. Personally, I prefer the technique in which we assign the semantic of "visibility/invisibility to users" on the basis of the roles played in various association types. For example, we could say that if a topic plays the "basename" role in one or more topic-basename associations, and it plays no other role in any association type that demands visibility, then it stays invisible, according to any application that respects that distinction. So, a user would get a "1" in your example, given an application that respects the distinction intended by the author.

Note that the technique I'm proposing here requires the ability for association templates to be subclassable by applications, and specifically that the topic-basename association template be subclassable. The subclass would add the visibility/invisibility semantic to its "basename" role. I'm not aware of any model of topic maps other than PMTM4 that offers this feature. I think it's crucial.

Let me wrap up, now.

In PMTM4, a name can itself be a topic, while at the same time being a name in every way. This means that a single (set of) subject identity point(s) is the single nexus of everything to do with that name, including but not limited to its topic-name-ishness. Anyone can say anything about that name in the usual way, by means of any kind of assertion (association).
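
The two ideas above (a name that is itself a topic, and visibility decided by the roles a topic plays) can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the class names, the "opinion" association, and the rule that every role other than "basename" demands visibility are all invented for this sketch and are not defined by PMTM4.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Topic:
    label: str  # for readability in this sketch only; not a topic "value"


@dataclass
class Association:
    template: str  # e.g. "topic-basename"
    roles: dict    # role name -> the Topic that plays it


# Assumption: an application-defined subclass of the topic-basename template
# marks its "basename" role as not demanding visibility.
INVISIBLE_ROLES = {("topic-basename", "basename")}


def visible_topics(topics, associations):
    """Keep topics that play no role at all, or at least one role that
    demands visibility (here: any role not listed in INVISIBLE_ROLES)."""
    visible = []
    for topic in topics:
        played = [(a.template, role)
                  for a in associations
                  for role, player in a.roles.items()
                  if player is topic]
        if not played or any(p not in INVISIBLE_ROLES for p in played):
            visible.append(topic)
    return visible


graham = Topic("the person")
graham_name = Topic("the name 'Graham'")
topics = [graham, graham_name]

naming = Association("topic-basename",
                     {"topic": graham, "basename": graham_name})

total = len(topics)                                   # what is in the map
user_view = len(visible_topics(topics, [naming]))     # what the user sees

# Saying something about the name itself ("a perfectly nice name") gives
# the name topic a second role, and with it, in this sketch, visibility.
opinion = Association("opinion", {"subject": graham_name})
user_view_after = len(visible_topics(topics, [naming, opinion]))
```

With only the naming assertion present, total is 2 and user_view is 1, which is the pair of answers in Graham's example. Once somebody makes an assertion about the name itself, the same name topic simply becomes visible; nothing has to be converted or duplicated.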
If, contrary to the grand simplification proposed by PMTM4, we say that topic names are not topics, then we can't do that, and, consequently, topic maps are not fully mergeable. Ugh.

Similarly, if we say that a name *can* be a topic, but that the idea of a particular name of a particular topic *can't* be *exactly the same thing* as a topic whose subject happens to be the same name of the same topic, then, again, we can't truly merge topic maps, because:

* We can't tell, from the perspective of the name-ishness of a name, what other kinds of things are being said about the name itself, and

* Conversely, we can't tell, from the perspective of the other things that are being said about the name, that it is also the name of a topic.

PMTM4 fully rationalizes this problem. A topic name is always itself the subject of another topic. Thus, topic maps really work, and there's no problem. There is nothing we can't talk about, and anything we do talk about has everything we say about it directly connected to it.

-Steve

--
Steven R. Newcomb, Consultant
srn@coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

1527 Northaven Drive
Allen, Texas 75002-1648 USA