Subject: Re: [topicmaps-comment] TMs & XTM [Was: skills to create topic maps]



* Tony Coates
| 
| Actually, I was talking with Sam Hunting at XML 2001 about the role
| of "nsgmls" in the history of SGML, and how one of its most useful
| functions was that it could normalise your input SGML into something
| that could then be processed by a less sophisticated application.  A
| similar tool for XTM, one that produced a particular normalised
| style of XTM that could then be processed using XSL-T, would be a
| very good thing indeed.

* H. Holger Rath
| 
| This magical piece of software is necessary for XTM conformance
| testing.  It would normalize the output of XTM processors and make
| them line-by-line comparable.

Actually, I think you two are not necessarily talking about the same
thing. Tony wants a normalized form of XTM, which is something
relatively simple, and pretty much like XTM. It could basically be a
restricted form of XTM where certain types of processing are required
to be performed in advance. (Basically, it would be the infoset model
exported into XTM syntax.)

The canonical syntax, on the other hand, needs to go into lots of
other things as well, such as defining the character encoding, the
order of constructs on export, and so on. These are quite hard, and
require more effort to implement. They are also decidedly difficult to
specify. (I know, I've tried[1].)

So it may be that these two things should be kept apart. Certainly the
first is a lot easier than the second.

| I remember some TopicMaps.Org discussions in Montreal about
| conformance testing and the problem of normalizing arbitrary XTM
| files which were exported from XTM software. Someone (Murray
| Altheim?) suggested having a piece of software which takes an XTM
| file and just writes out a canonical form of it - without (!)
| applying any XTM processing (e.g. merging). We know that proper
| sorting of the topics will be the real problem (because of
| scopes). Any volunteers to develop this, or anyone who can contribute
| valuable input to the design, would be highly appreciated.

Holger, what is the "it" here? What is its purpose? How does it fit
into the larger picture? (I don't know, and I would like to know
before I comment. :-)

* Murray Altheim
| 
| I can see* three types of "normalized" topic map, each possibly a more 
| successively processed document: 
| 
|   1. XTM-normalized
|      This would be a "sorted" view of an XML document, sorted following
|      some agreed-upon algorithm such as by-ID, by-scope, by-name-within-
|      scope, occurrences-by-URI, etc.  This would allow two XTM documents
|      to be compared using common XML and diff tools, but would not take
|      into account differences due to merging, authoring approaches, etc.

I think this may be quite dangerous. It suggests that it is OK to
import and export an XTM document without doing all the required
processing on it. The XTM specification as rewritten by ISO (described
elsewhere[2], BTW) will make this form of XTM look quite strange.

We'll have a model, a specification describing how to process XTM
documents into instances of that model, and a normative recommendation
for how to get back to XTM documents. The first step will require
certain kinds of merging to be performed.

How will you explain the role of this form of normalized XTM in this
context? I'm not necessarily against this, but I'm concerned that we
need to end up with a coherent set of specifications at the end of the
day. SC34's plans are now quite clear and coherent, and we need to
make sure that we don't mess that up.
 
|   2. XTM-merged-and-normalized
|      This would be the exported output from a compliant topic map engine
|      that performed all required topic map merges (and other functions
|      such as duplicate suppression), but still maintains in some fashion
|      the XTM features that might be considered "author intentions" such
|      as ID names and other things necessary for interoperability. For
|      example, if three <topic> elements are merged, their IDs would 
|      be still maintained as almost-empty <topic> elements that pointed
|      to the merged/conglomerated <topic>, so that ID references would
|      still function.

This sounds like the result of following the XTM import/export
procedure to be described by ISO, and then sorting the topics by some
agreed procedure in the exported file. Sounds perfectly fine to me, as
it would enable topic map work to be done using less powerful tools,
which in effect makes topic maps easier to work with and easier to get
started with.

I don't think this is necessarily the same as what Tony is talking
about, but it's getting pretty close to canonicalization, since there
has to be an agreed-upon order for the topics (and associations). So
probably this could be specified very easily by cobbling together bits
and pieces from the updated ISO standard.
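
As a rough illustration of the ordering step only, here is a minimal
Python sketch. It assumes an XTM 1.0 document whose merging has already
been done by a conforming engine, and it uses a purely hypothetical
ordering rule (topics first, then associations, each sorted by their
"id" attribute). The function name sort_topic_map is made up for this
example, and the real ordering would have to come from whatever the
standard ends up specifying.

```python
import xml.etree.ElementTree as ET

# Namespace of XTM 1.0 elements.
XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
ET.register_namespace("", XTM_NS)


def sort_topic_map(xml_text: str) -> str:
    """Reorder the children of <topicMap> into a fixed order.

    Assumed (hypothetical) rule: topics, then associations, then
    anything else, each group sorted by its "id" attribute.
    No merging or duplicate suppression is performed here.
    """
    root = ET.fromstring(xml_text)
    rank = {
        f"{{{XTM_NS}}}topic": 0,
        f"{{{XTM_NS}}}association": 1,
    }
    children = list(root)
    # Elements without an id sort first within their group; the sort
    # is stable, so ties keep their original document order.
    children.sort(key=lambda el: (rank.get(el.tag, 2), el.get("id", "")))
    root[:] = children
    return ET.tostring(root, encoding="unicode")
```

Sorting by ID alone is clearly not enough for real comparison, since
IDs are assigned by authors and tools, but it shows how little is
needed once the import/export procedure has done the hard part.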
 
|   3. graph normalized
|      This would be a topic map that throws XTM interoperability to the
|      wind and might not even be in XTM syntax, perhaps something akin
|      to Steve and Michel's topic map graph DTD syntax. This might discard
|      name variants, for example, and might have some type of algorithm
|      for handling references to topics (since XLink, links and links to
|      IDs are an XML thing, not necessarily a topic map graph thing).

This one I don't understand. What is it intended to achieve? How is it
going to do that?

| I think each would have uses, and as I mentioned above, might be
| part of a chain of processing.

I think #2 may have some uses. The need for #1 and #3 I don't see, but
more explanation might convince me.

| * BTW, I didn't spend more than about ten minutes thinking about
| this, so there are likely many issues that would be brought to the
| surface by a more methodical analysis. But this issue has certainly
| been on my mind over the past month or so.

What issue, Murray? I see lots of them here, and I'm not sure which
one it is you're looking to address.

--Lars M.


