topicmaps-comment message

Subject: RE: [xtm-wg] Dynamic Generation and Serving of Topic Maps
From: Peter Jones <peterj@wrox.com>
To: "'xtm-wg@egroups.com'" <xtm-wg@egroups.com>
Date: Wed, 19 Jul 2000 10:50:09 +0100
Comments inserted below...

I think we seriously need to engage with the problem of topic map fragments.

cheers
Peter

-----Original Message-----
From: Steven R. Newcomb [mailto:srn@techno.com]
Sent: 17 July 2000 21:29
To: xtm-wg@egroups.com
Subject: Re: [xtm-wg] Dynamic Generation and Serving of Topic Maps


[Peter Jones:]
> Assume that I have a tool that is a lot like a search engine, that
> fetches a pile of interesting media of one format or another in
> response to a query.  Assume also that the pile fetched has no
> coherent TM already defined for it in a single TM document
> somewhere, but that each bit of stuff is referred to in various TMs'
> topics and associations. At the same time as the pile of stuff is
> fetched for me I want to have a minimal TM constructed for it, one
> that covers only those topics referred to in this set of stuff and
> any relevant associations that I can grab.
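
A minimal sketch of the selection described above, in Python. It assumes a toy in-memory model in which the occurrences and associations of the contributing TMs are already known; the function name, the data layout, and the sample ids are all hypothetical and are not part of any XTM or HyTime syntax.

```python
def minimal_topic_map(fetched, occurrences, associations):
    """Select topics with an occurrence in `fetched`, plus associations among them.

    occurrences:  topic id -> set of resource addresses (hypothetical layout)
    associations: list of (association type, set of member topic ids)
    """
    # A topic is kept if at least one of its occurrences was fetched.
    topics = {t for t, addrs in occurrences.items() if addrs & fetched}
    # An association is kept only if every member topic was kept.
    assocs = [(kind, members) for kind, members in associations if members <= topics]
    return topics, assocs


occurrences = {
    "puccini": {"bio.html"},
    "tosca": {"operas.html"},
    "verdi": {"verdi.html"},
}
associations = [
    ("composed-by", {"tosca", "puccini"}),
    ("rival-of", {"puccini", "verdi"}),
]
topics, assocs = minimal_topic_map({"bio.html", "operas.html"}, occurrences, associations)
print(sorted(topics))                # ['puccini', 'tosca']
print([kind for kind, _ in assocs])  # ['composed-by']
```

Note that the sketch takes the full occurrence and association tables as input, which is exactly the overhead under discussion in the rest of this thread.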

To make a selection from some set, you must first obtain the set
from which you want to select. 

[PPJ] Can't I just know the type/properties of the set and some location
within it? Less overhead.

 I think maybe what you're asking is
that we standardize some algorithm for establishing what HyTime calls
an "application-defined bounded object set".  The "bounded object set
(BOS)" is the set of resources for which a HyTime application (such as
a Topic Maps application) is responsible.  "Responsible" means
"knowing about all the links, and knowing where all their anchors
are".  I think such an algorithm may well be a reasonable thing to
standardize, but I don't yet understand exactly what are the
requirements that such an algorithm must fulfill.  We should talk
about this in a meeting.

[PPJ] In the situation I am describing (I will have another stab at
improving the communication of this in a forthcoming mail) I envisage the
creation of the BOS as something that takes place after the retrieval (see
also later comments about scope and integrity in this mail).
As I don't yet have an adequate understanding of HyTime, I can't judge how
flexible a BOS is. How open to revision at run-time is it?

You also seem to be suggesting that we standardize some algorithm for
selecting from a topic map only those constructs that are relevant to
some set of resources.  I question the general utility/advisability of
this idea.  To make a selection from a topic map may "edit it to
death" -- e.g. by invalidating the scopes that contain themes that are
no longer present in the "selected" version.  

[PPJ] If the addthms are always required to be at the top of an XTM doc a
suitable compromise can be reached(?).
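
The "edit it to death" concern can be illustrated with a small Python sketch. It assumes a toy model in which each topic name carries a scope (a set of theme ids) and themes are themselves topics in the map; the dictionary layout and the function names `select` and `dangling_themes` are hypothetical.

```python
# Hypothetical toy model: topic id -> list of (name, scope) pairs,
# where a scope is a set of theme ids and themes are topics too.
topic_map = {
    "puccini": {"names": [("Puccini", {"lang-it"}), ("Giacomo Puccini", {"full-name"})]},
    "tosca": {"names": [("Tosca", {"lang-it"})]},
    "lang-it": {"names": [("Italian", set())]},
    "full-name": {"names": [("full name", set())]},
}


def select(tm, wanted):
    """Keep only the topics whose ids are in `wanted`."""
    return {tid: topic for tid, topic in tm.items() if tid in wanted}


def dangling_themes(tm):
    """Return theme ids that are used in scopes but are not topics in `tm`."""
    missing = set()
    for topic in tm.values():
        for _name, scope in topic["names"]:
            missing |= {theme for theme in scope if theme not in tm}
    return missing


fragment = select(topic_map, {"puccini", "tosca"})
print(sorted(dangling_themes(topic_map)))  # []
print(sorted(dangling_themes(fragment)))   # ['full-name', 'lang-it']
```

In this sketch the full map has no dangling themes, while the selected fragment refers to two themes it no longer contains.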

Even if the selected
portion(s) of the topic map do not require any other parts of the
topic map in order to have integrity, the topic map author's
conception of the structure of knowledge will still be seriously
affected; it's just not the same topic map any more. 

[PPJ] Yes. It isn't. (Echoes of Roland Barthes on the "Death of the Author".)
But does that completely kill its utility?

 I'd be happier
to leave this whole question (i.e., the question of how topic maps can
be made from other topic maps, and of how topic maps should be
presented in particular contexts) to applications. 

[PPJ] It might be smart to specify some sort of default association that TM
processors must implement: something to the effect that, in the absence of any
defined associations connecting up topics found in the BOS, these will be
automatically attached to a 'DefaultAssoc_MemberOfThisDocForNow' type assoc.
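
One possible reading of this default-association idea, sketched in Python. The association type name is taken from the comment above; everything else (the function name, the representation of associations as (type, members) pairs, the sample ids) is an assumption made for illustration only.

```python
DEFAULT_TYPE = "DefaultAssoc_MemberOfThisDocForNow"


def add_default_association(topic_ids, associations):
    """Return the associations plus one default association for unconnected topics.

    `associations` is a list of (type, members) pairs, where `members`
    is a set of topic ids (a hypothetical representation).
    """
    connected = set()
    for _type, members in associations:
        connected |= members
    orphans = {t for t in topic_ids if t not in connected}
    if not orphans:
        return list(associations)  # nothing to attach
    return list(associations) + [(DEFAULT_TYPE, orphans)]


result = add_default_association(
    ["a", "b", "c", "d"],
    [("written-by", {"a", "b"})],
)
print(result[-1][0], sorted(result[-1][1]))
# DefaultAssoc_MemberOfThisDocForNow ['c', 'd']
```

Topics that already take part in some association are left alone; only the unconnected ones are gathered into the default association.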

 In the context of
a particular rendering application, it may be reasonable to hide
topics that have no occurrences, but it doesn't seem to me a
reasonable assumption to make on behalf of all topic map applications,
and on behalf of all topic map authors.

[PPJ] Sketch of an algorithm for this forthcoming.

[Ann Wrightson:]
> 2 approaches:

> 1) Computing a new TM from pre-existing TM does not necessarily make
> the pre-existing ones visible as part of the new TM.

But it does require understanding the pre-existing TM, in order to
make selections from it.  I don't see how this reduces overhead in the
way Peter Jones wants to reduce it.

> The application could nevertheless remember the computation in some
> way, and allow the pre-existing TM to be accessed *if required*.

Either the anchors are known or they are not known.  That means that
either you have processed the whole bounded object set, or you have
not.  You can't make this computation lazily.  If you have not made
the computation up front, you have no way of knowing, when you're
looking at something, what may be linked to it.

[PPJ] I don't agree that on the WWWeb things like this cannot be done
lazily. It would seem to me that in the arena of publicly available topic
maps on the web it is more like a necessity that we be able to do this.

> 2) The new TM could refer to parts of the pre-existing TMs as within
> its own map. Then the application would have no compulsion to regard
> them *as TM*, so reducing various overheads on processing. Again,
> the full shebang could then be dug out if required, piece by piece
> (depending on the application supporting such goings-on, of
> course...)

Either you have processed the pre-existing TM, or you have not.  It
can't be dug out piece by piece, unless the overhead of digging out a
piece is equal to the overhead of processing the entire TM.

Within the topic map document itself, we can't know what associations
a topic participates in without reading and resolving *all* of the
association links.  

[PPJ] See comments about laziness above. I see no problem with iterations
with a set crawl depth.
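
A minimal Python sketch of iteration with a set crawl depth. The `DOCS` table stands in for topic map documents on the web and the links between them; it, the document ids, and the `crawl` function are hypothetical.

```python
from collections import deque

# Stand-in for the web: document id -> ids of the documents it links to.
DOCS = {
    "tm-a": ["tm-b", "tm-c"],
    "tm-b": ["tm-d"],
    "tm-c": [],
    "tm-d": ["tm-e"],
    "tm-e": [],
}


def crawl(start, max_depth, fetch_links=DOCS.get):
    """Breadth-first crawl from `start`, following links at most `max_depth` hops."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        doc, depth = queue.popleft()
        if depth == max_depth:
            continue  # boundary reached: this document's links stay unresolved
        for target in fetch_links(doc) or []:
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order


print(crawl("tm-a", 1))  # ['tm-a', 'tm-b', 'tm-c']
print(crawl("tm-a", 2))  # ['tm-a', 'tm-b', 'tm-c', 'tm-d']
```

Raising the depth widens the set that has been processed; whatever lies beyond the boundary (here "tm-e" at depth 2) remains unknown to the application until a later iteration.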

Within the set of resources mapped by a topic map (the bounded object
set (BOS) that includes those resources as well as the topic map
document itself), we can't know which parts of which resources are
regarded as occurrences without reading and resolving *all* of the
topic links.
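
The point about resolving all of the topic links can be sketched in Python as the construction of an inverted index. The `TOPIC_LINKS` table, the example.org addresses, and the function name are hypothetical; the sketch only shows that answering "which topics regard this resource as an occurrence?" needs one pass over every topic link.

```python
from collections import defaultdict

# Hypothetical topic links: topic id -> addresses of its occurrences.
TOPIC_LINKS = {
    "puccini": ["http://example.org/bio.html", "http://example.org/operas.html"],
    "tosca": ["http://example.org/operas.html"],
}


def build_occurrence_index(topic_links):
    """Invert topic -> occurrences into occurrence -> topics in one full pass."""
    index = defaultdict(set)
    for topic_id, addresses in topic_links.items():
        for address in addresses:
            index[address].add(topic_id)
    return index


index = build_occurrence_index(TOPIC_LINKS)
print(sorted(index["http://example.org/operas.html"]))  # ['puccini', 'tosca']
```

Going from a topic to its occurrences is a direct lookup in `TOPIC_LINKS`; going from a resource back to its topics is only possible once the index has been built.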

[PPJ] If we are assuming that the BOS is something that is indicated in a
root doc, and that the only access to the contributing docs is via that root
doc, then I think we are making some grossly unrealistic assumptions about
the way access to publicly accessible topic maps docs can be controlled.
Think about the way existing search engines on the web just index the whole
shebang and let you dive in at any doc that's indexed.

I realize that some exceptionally simple topic map applications may
only need to provide traversal service from the map to the
occurrences, and not from the occurrences to the map.  This is like
the WWW model of <a href="..."> links, in which you can go to the
other anchor, but you can't start from the other anchor.  However,
this simplifying assumption, if generally applied to topic maps, would
utterly destroy the significance of the phrase "topic map"; it would
be a misappropriation of the "map" metaphor.  A topic map based on
this simplifying assumption would be like a road map that wouldn't let
one find and use any appropriate road near wherever one actually was,
in order eventually to get to wherever one wanted to go; all roads
would lead one in the wrong direction -- away from the topic links --
and they could only be entered at the topic links.  If one must first
be at a topic link in order to get anywhere else, it becomes literally
true that "one can't get there from here", no matter where "here" is,
unless "here" happens to be some topic link within the topic map
document.  I therefore claim that "lazy" processing of links and
anchors is incompatible with the whole idea of topic maps.

[PPJ] But you could employ something like C++ compilers use when they give
all those 'unresolved external symbol' errors from the lookup table
(assuming you've got a resource missing). It primes the app to return to the
location it got the TM segment from to go back and look for more if
necessary.
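
A rough Python sketch of this "unresolved symbol" idea, assuming a TM segment is represented as a mapping from topic ids to the topic ids they refer to. The function name, the segment layout, and the example.org address are hypothetical.

```python
def resolve_fragment(fragment, source):
    """Split a fragment's references into resolved and unresolved ones.

    `fragment` maps topic id -> list of referenced topic ids.
    Returns (resolved, unresolved), where `unresolved` maps each missing
    topic id to the location the fragment came from, so that the
    application knows where to go back and look for more.
    """
    resolved, unresolved = set(), {}
    for _topic_id, refs in fragment.items():
        for ref in refs:
            if ref in fragment:
                resolved.add(ref)
            else:
                unresolved[ref] = source
    return resolved, unresolved


fragment = {"tosca": ["puccini", "opera"], "puccini": ["composer"]}
resolved, unresolved = resolve_fragment(fragment, "http://example.org/operas.xtm")
print(sorted(resolved))    # ['puccini']
print(sorted(unresolved))  # ['composer', 'opera']
```

The unresolved table plays the role of the lookup table mentioned above: each entry records a reference that could not be satisfied from the segment in hand, together with the place to return to.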

  Yes,
pre-processing of bounded object sets (BOSs) is expensive.  Yes, the
Topic Maps paradigm is not supportable using existing commonplace
Web-centric applications and processing conventions. 

[PPJ] Hmm. I sense sponsorship ebbing away in a matter of femtoseconds.

 Let's accept
these immutable facts and move forward.  There are methods of coping
with these necessary constraints, and at least some of the methods are
both practical and cost-effective.

As far as I know, the XLink recommendation doesn't really address the
question of how a BOS is established or processed.  Therefore, we'll
have to address this question somehow in XTM.  We can either:

* regard the BOS as declarable in the topic map document (and we might
  adopt the HyTime syntax (or some subset of it) for doing that,
  rather than inventing something new), or

* we can establish a new TM-specific standard algorithm for
  establishing/declaring/determining the BOS, or

* we can refuse (as XLink does) to explain that part of the problem in
  any way.  The only problem with failing to explain it is that topic
  maps won't be reliably interchangeable between applications.  (I
  personally find this last alternative unacceptable.)

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 359 8160
fax    +1 972 359 0270

405 Flagler Court
Allen, Texas 75013-2821 USA


To Post a message, send it to:   xtm-wg@eGroups.com

To Unsubscribe, send a blank message to: xtm-wg-unsubscribe@eGroups.com

Follow-Ups:
- [xtm-wg] lazy processing vs. extended XLink and BOS processing
  - From: "Steven R. Newcomb" <srn@techno.com>