Subject: Re: [topicmaps-comment] TMs & XTM [Was: skills to create topic maps]

"H. Holger Rath" wrote:
>
> Tony.Coates@reuters.com wrote:
> >
> > ...snip...
> >
> > Actually, I was talking with Sam Hunting at XML 2001 about the role
> > of "nsgmls" in the history of SGML, and how one of its most useful
> > functions was that it could normalise your input SGML into something
> > that could then be processed by a less sophisticated application. A
> > similar tool for XTM, one that produced a particular normalised style
> > of XTM that could then be processed using XSL-T, would be a very good
> > thing indeed.
>
> This magical piece of software is necessary for XTM conformance testing.
> It would normalize the output of XTM processors and make them
> line-by-line comparable.
>
> I remember some TopicMaps.Org discussions in Montreal about conformance
> testing and the problem of normalizing arbitrary XTM files which were
> exported from XTM software. Someone (Murray Altheim?) suggested to have
> a piece of software which takes an XTM file and just writes out a
> canonical form of it - without (!) applying any XTM processing (e.g.
> merging). We know that proper sorting of the topics will be the real
> problem (because of scopes). Any volunteers to develop this, or who can
> contribute valuable input to the design, are highly appreciated.

I can see* three types of "normalized" topic map, each possibly a more
processed form of the one before it:

1. XTM-normalized

   This would be a "sorted" view of an XML document, sorted following
   some agreed-upon algorithm such as by-ID, by-scope,
   by-name-within-scope, occurrences-by-URI, etc. This would allow two
   XTM documents to be compared using common XML and diff tools, but
   would not take into account differences due to merging, authoring
   approaches, etc.

2. XTM-merged-and-normalized

   This would be the exported output from a compliant topic map engine
   that performed all required topic map merges (and other functions
   such as duplicate suppression), but still maintains in some fashion
   the XTM features that might be considered "author intentions", such
   as ID names and other things necessary for interoperability. For
   example, if three <topic> elements are merged, their IDs would still
   be maintained as almost-empty <topic> elements that point to the
   merged/conglomerated <topic>, so that ID references would still
   function.

3. graph normalized

   This would be a topic map that throws XTM interoperability to the
   wind and might not even be in XTM syntax, perhaps something akin to
   Steve and Michel's topic map graph DTD syntax. This might discard
   name variants, for example, and might have some type of algorithm
   for handling references to topics (since XLink, links and links to
   IDs are an XML thing, not necessarily a topic map graph thing).

I think each would have uses, and as I mentioned above, each might be
part of a chain of processing.

Murray

* BTW, I didn't spend more than about ten minutes thinking about this,
  so there are likely many issues that would be brought to the surface
  by a more methodical analysis. But this issue has certainly been on
  my mind over the past month or so.

...........................................................................
Murray Altheim, Staff Engineer          <mailto:murray.altheim@sun.com>
Java and XML Software
Sun Microsystems, 1601 Willow Rd., MS UMPK17-102, Menlo Park, CA 94025

Ernst Martin comments in 1949, "A certain degree of noise in writing
is required for confidence. Without such noise, the writer would not
know whether the type was actually printing or not, so he would lose
control."
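
As a rough illustration of the first type above (XTM-normalized), here
is a minimal Python sketch. It is an assumption-laden example, not an
agreed-upon algorithm: the function name normalize_xtm is hypothetical,
it sorts only the top-level children of the root element by element
name and then by "id" attribute, and it does no merging or any other
XTM processing. Ordering by scope, by name within scope, or by
occurrence URI is not attempted.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the element name without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def normalize_xtm(xml_text):
    """Hypothetical helper: return a sorted, diff-friendly serialization.

    Assumption: sorting top-level children by (element name, id) is
    enough for illustration. No merging is performed.
    """
    root = ET.fromstring(xml_text)

    # Drop whitespace-only text so indentation differences vanish.
    for elem in root.iter():
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        if elem.tail is not None and not elem.tail.strip():
            elem.tail = None

    # Stable sort: elements with equal keys keep their document order.
    root[:] = sorted(root, key=lambda e: (local_name(e.tag), e.get("id", "")))

    # One top-level element per line, so line-based diff tools can be used.
    root.text = "\n"
    for child in root:
        child.tail = "\n"

    return ET.tostring(root, encoding="unicode")
```

Two files that differ only in topic order or indentation should then
serialize identically, so an ordinary diff of the two outputs shows
only the remaining differences.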