Subject: Re: [topicmaps-comment] TMs & XTM [Was: skills to create topic maps]
"H. Holger Rath" wrote:
>
> Tony.Coates@reuters.com wrote:
> >
> > ...snip...
> >
> > Actually, I was talking with Sam Hunting at XML 2001 about the role of "nsgmls" in the
> > history of SGML, and how one of its most useful functions was that it could normalise
> > your input SGML into something that could then be processed by a less sophisticated
> > application. A similar tool for XTM, one that produced a particular normalised style of
> > XTM that could then be processed using XSL-T, would be a very good thing indeed.
>
> This magical piece of software is necessary for XTM conformance testing.
> It would normalize the output of XTM processors and make them line-by-line
> comparable.
>
> I remember some TopicMaps.Org discussions in Montreal about conformance
> testing and the problem of normalizing arbitrary XTM files which were
> exported from XTM software. Someone (Murray Altheim?) suggested having
> a piece of software which takes an XTM file and just writes out a
> canonical form of it - without (!) applying any XTM processing (e.g.
> merging). We know that proper sorting of the topics will be the real
> problem (because of scopes). Volunteers to develop this, or anyone who can
> contribute valuable input to the design, would be highly appreciated.

I can see* three types of "normalized" topic map, each possibly a
successively more processed document:
1. XTM-normalized
This would be a "sorted" view of an XML document, sorted following
some agreed-upon algorithm such as by-ID, by-scope, by-name-within-
scope, occurrences-by-URI, etc. This would allow two XTM documents
to be compared using common XML and diff tools, but would not take
into account differences due to merging, authoring approaches, etc.
2. XTM-merged-and-normalized
This would be the exported output from a compliant topic map engine
that performed all required topic map merges (and other functions
such as duplicate suppression), but still maintains in some fashion
the XTM features that might be considered "author intentions" such
as ID names and other things necessary for interoperability. For
example, if three <topic> elements are merged, their IDs would
be still maintained as almost-empty <topic> elements that pointed
to the merged/conglomerated <topic>, so that ID references would
still function.
3. graph-normalized
This would be a topic map that throws XTM interoperability to the
wind and might not even be in XTM syntax, perhaps something akin
to Steve and Michel's topic map graph DTD syntax. This might discard
name variants, for example, and might have some type of algorithm
for handling references to topics (since XLink, links and links to
IDs are an XML thing, not necessarily a topic map graph thing).
I think each would have uses, and as I mentioned above, might be part of
a chain of processing.
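
As a rough illustration of the first type only (XTM-normalized), here is a
minimal Python sketch. The sort keys are my assumptions, since no algorithm
has been agreed upon: topics by ID, base names by scope and then by name
string, occurrences by the URI of their <resourceRef>. The function names
are hypothetical. It assumes XTM 1.0 element names and namespaces, performs
no merging or other XTM processing, and leaves everything it does not
recognize in its original relative order.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("", XTM)        # serialize XTM as the default namespace
ET.register_namespace("xlink", XLINK)
HREF = f"{{{XLINK}}}href"


def q(tag):
    """Return the namespace-qualified name of an XTM element."""
    return f"{{{XTM}}}{tag}"


def name_key(base_name):
    """Assumed key: sorted scope references first, then the name string."""
    scope = base_name.find(q("scope"))
    refs = () if scope is None else tuple(sorted(r.get(HREF, "") for r in scope))
    return (refs, base_name.findtext(q("baseNameString"), default=""))


def occurrence_key(occurrence):
    """Assumed key: the URI in <resourceRef>; inline data sorts as ''."""
    ref = occurrence.find(q("resourceRef"))
    return ref.get(HREF, "") if ref is not None else ""


def normalize_topic(topic):
    """Reorder one <topic> in place: other children, names, occurrences."""
    children = list(topic)
    names = sorted((c for c in children if c.tag == q("baseName")), key=name_key)
    occs = sorted((c for c in children if c.tag == q("occurrence")),
                  key=occurrence_key)
    rest = [c for c in children if c.tag not in (q("baseName"), q("occurrence"))]
    topic[:] = rest + names + occs


def normalize(xtm_text):
    """Return a sorted, re-indented serialization of an XTM document."""
    root = ET.fromstring(xtm_text)
    children = list(root)
    topics = sorted((c for c in children if c.tag == q("topic")),
                    key=lambda t: t.get("id", ""))
    rest = [c for c in children if c.tag != q("topic")]
    for topic in topics:
        normalize_topic(topic)
    root[:] = topics + rest
    ET.indent(root)                   # uniform whitespace; needs Python 3.9+
    return ET.tostring(root, encoding="unicode")
```

Running two exports of the same topic map through normalize() and feeding the
results to an ordinary diff tool is the intended use. Anything that requires
real topic map processing, such as the scope-related sorting problems Holger
mentions, is outside what this sketch attempts.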
Murray
* BTW, I didn't spend more than about ten minutes thinking about this, so
there are likely many issues that would be brought to the surface by a more
methodical analysis. But this issue has certainly been on my mind over the
past month or so.
...........................................................................
Murray Altheim, Staff Engineer <mailto:murray.altheim@sun.com>
Java and XML Software
Sun Microsystems, 1601 Willow Rd., MS UMPK17-102, Menlo Park, CA 94025
Ernst Martin comments in 1949, "A certain degree of noise in
writing is required for confidence. Without such noise, the
writer would not know whether the type was actually printing
or not, so he would lose control."