topicmaps-comment message

Subject: Re: [topicmaps-comment] TMs & XTM [Was: skills to create topic maps]

From: Murray Altheim <murray.altheim@sun.com>
To: "H. Holger Rath" <holger.rath@empolis.com>
Date: Wed, 02 Jan 2002 13:37:53 -0800

"H. Holger Rath" wrote:
> 
> Tony.Coates@reuters.com wrote:
> >
> > ...snip...
> >
> > Actually, I was talking with Sam Hunting at XML 2001 about the role of "nsgmls" in the
> > history of SGML, and how one of its most useful functions was that it could normalise
> > your input SGML into something that could then be processed by a less sophisticated
> > application.  A similar tool for XTM, one that produced a particular normalised style of
> > XTM that could then be processed using XSL-T, would be a very good thing indeed.
> 
> This magical piece of software is necessary for XTM conformance testing.
> It would normalize the output of XTM processors and make them line-by-line
> comparable.
> 
> I remember some TopicMaps.Org discussions in Montreal about conformance
> testing and the problem of normalizing arbitrary XTM files which were
> exported from XTM software. Someone (Murray Altheim?) suggested having
> a piece of software which takes an XTM file and just writes out a
> canonical form of it - without (!) applying any XTM processing (e.g.
> merging). We know that proper sorting of the topics will be the real
> problem (because of scopes). Any volunteers to develop this or who can
> contribute valuable input to the design are highly appreciated?

I can see* three types of "normalized" topic map, each possibly a
successively more processed document:

  1. XTM-normalized
     This would be a "sorted" view of an XML document, sorted following
     some agreed-upon algorithm such as by-ID, by-scope, by-name-within-
     scope, occurrences-by-URI, etc.  This would allow two XTM documents
     to be compared using common XML and diff tools, but would not take
     into account differences due to merging, authoring approaches, etc.

  2. XTM-merged-and-normalized
     This would be the exported output from a compliant topic map engine
     that performed all required topic map merges (and other functions
     such as duplicate suppression), but still maintains in some fashion
     the XTM features that might be considered "author intentions" such
     as ID names and other things necessary for interoperability. For
     example, if three <topic> elements are merged, their IDs would 
     be still maintained as almost-empty <topic> elements that pointed
     to the merged/conglomerated <topic>, so that ID references would
     still function.

  3. Graph-normalized
     This would be a topic map that throws XTM interoperability to the
     wind and might not even be in XTM syntax, perhaps something akin
     to Steve and Michel's topic map graph DTD syntax. This might discard
     name variants, for example, and might have some type of algorithm
     for handling references to topics (since XLink, links and links to
     IDs are an XML thing, not necessarily a topic map graph thing).
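
As a minimal sketch of the first type (XTM-normalized), assuming the
agreed-upon algorithm is nothing more than "sort the direct children of
the root element by element name, then by ID": the function name
normalize_xtm and that sort key are illustrative assumptions, not part of
any specification, and scopes, names and occurrences are deliberately left
unsorted here. No merging or other topic map processing is applied.

```python
import xml.etree.ElementTree as ET


def normalize_xtm(xml_text: str) -> str:
    """Return the document with the root's children in a fixed order.

    Hypothetical helper: sorts only the direct children of the root
    element by (element name, id attribute). Nested content is untouched.
    """
    root = ET.fromstring(xml_text)
    # sorted() is stable, so children with equal keys keep document order.
    root[:] = sorted(root, key=lambda el: (el.tag, el.get("id", "")))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    doc = (
        '<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/">'
        '<topic id="b"/><topic id="a"/>'
        "</topicMap>"
    )
    print(normalize_xtm(doc))
```

Two documents that differ only in the order of their top-level elements
serialize identically after this step, which is what makes a line-based
diff meaningful. Whitespace between elements is carried along unchanged,
so a real tool would also need to agree on indentation.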

I think each would have uses, and as I mentioned above, might be part of
a chain of processing. 
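
The ID-preserving idea in the second type can be sketched as a redirect
table, assuming the engine has already decided which topics merge. The
names build_redirects and resolve are hypothetical, and picking the
lexicographically smallest ID as the surviving topic is an arbitrary
assumption made only so the result is deterministic.

```python
def build_redirects(merge_groups):
    """Map every original topic ID to the ID of its merged topic.

    merge_groups: iterable of lists of topic IDs that were merged.
    Each non-surviving ID plays the role of an almost-empty stub that
    points at the merged topic.
    """
    redirects = {}
    for ids in merge_groups:
        survivor = min(ids)  # assumption: smallest ID survives
        for topic_id in ids:
            redirects[topic_id] = survivor
    return redirects


def resolve(redirects, topic_id):
    """Follow a stub to the merged topic; unmerged IDs map to themselves."""
    return redirects.get(topic_id, topic_id)


if __name__ == "__main__":
    table = build_redirects([["t3", "t1", "t2"]])
    print(resolve(table, "t3"))
    print(resolve(table, "unmerged"))
```

With such a table, a reference to any of the three original IDs still
lands on the merged topic, so existing ID references keep working.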

Murray

* BTW, I didn't spend more than about ten minutes thinking about this, so
there are likely many issues that would be brought to the surface by a more
methodical analysis. But this issue has certainly been on my mind over the
past month or so.
...........................................................................
Murray Altheim, Staff Engineer          <mailto:murray.altheim@sun.com>
Java and XML Software
Sun Microsystems, 1601 Willow Rd., MS UMPK17-102, Menlo Park, CA 94025

       Ernst Martin comments in 1949, "A certain degree of noise in 
       writing is required for confidence. Without such noise, the 
       writer would not know whether the type was actually printing 
       or not, so he would lose control."

Follow-Ups:
- Re: [topicmaps-comment] TMs & XTM [Was: skills to create topic maps]
  - From: Sam Hunting <sam_hunting@yahoo.com>
- Re: [topicmaps-comment] TMs & XTM [Was: skills to create topic maps]
  - From: Lars Marius Garshol <larsga@garshol.priv.no>

References:
- Re: [topicmaps-comment] TMs & XTM [Was: skills to create topic maps]
  - From: Tony.Coates@reuters.com
- Re: [topicmaps-comment] TMs & XTM [Was: skills to create topic maps]
  - From: "H. Holger Rath" <holger.rath@empolis.com>