dita message

Subject: RE: FW: FW: [dita] indexing question

From: Erik Hennum <ehennum@us.ibm.com>
To: "Tony Self" <tself@hyperwrite.com>
Date: Thu, 20 Jul 2006 07:57:32 -0700

Hi, Tony:

That's a pretty interesting thought. As part of such a strategy, I'm wondering if it might make sense to map a list of synonyms to a list of index terms.

If I'm understanding you correctly, in this approach, someone might list the terms and their index equivalents at the top of the map. Here's some strawman markup to indicate that, within the scope of this map, instances of the "SELECT" keyword or the "database selection" term should be indexed under the three supplied index terms:

. <map>
.... <indexdefs>
........ <indexitem>
............ <keyword>SELECT</keyword>
............ <term>database selection</term>
............ <indexterm>query</indexterm>
............ <indexterm>retrieve</indexterm>
............ <indexterm>database <indexterm>query</indexterm></indexterm>
........ </indexitem>
.... </indexdefs>
.... many scanned topics
. </map>

The process would scan through whatever topics are pulled into this map and build lists and ranges based on the proximity of the keyword and term instances (and perhaps on the location -- if a term occurs in the title or short description, the case could be made that we should create a range for the entire topic).
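
To make the scanning step a little more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `INDEX_DEFS` structure is just an in-memory stand-in for the strawman <indexitem> markup, `mentions` and `build_index` are hypothetical names, and topics are reduced to (id, text) pairs. It only builds the list of topics per index entry; deriving ranges from proximity or from title/short description hits is left out.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for one strawman <indexitem>: the writer's wording
# (keywords, terms) mapped to the entries a reader might look up.
# A nested index term is written as a tuple (primary, secondary).
INDEX_DEFS = [
    {
        "keywords": ["SELECT"],            # matched case-sensitively
        "terms": ["database selection"],   # matched case-insensitively
        "entries": [("query",), ("retrieve",), ("database", "query")],
    },
]

def mentions(text, item):
    """Return True if the text contains any keyword or term of the item."""
    for kw in item["keywords"]:
        if re.search(r"\b" + re.escape(kw) + r"\b", text):
            return True
    for term in item["terms"]:
        if re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE):
            return True
    return False

def build_index(topics, index_defs):
    """Map each index entry to the topic ids (in map order) that mention it.

    `topics` is a list of (topic_id, text) pairs in the order the map
    pulls them in.
    """
    index = defaultdict(list)
    for topic_id, text in topics:
        for item in index_defs:
            if mentions(text, item):
                for entry in item["entries"]:
                    index[entry].append(topic_id)
    return dict(index)
```

Because the mapping lives apart from the topics, adding or removing a topic only changes the `topics` list passed in; the definitions stay untouched.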

Is that the idea?

This approach is appealing for its concreteness -- the language the writer uses to cover a concept in the content is mapped to the language the user might use for the same concept -- and because it offers the potential to adapt to reuse. If I add or remove topics, I don't have to change my mappings.

Because we haven't explored the idea, and because we committed in February to a disciplined approach to getting 1.1 out the door, I think we are probably obliged to defer consideration. There would be more to look at when we return to it; for instance, there might be some interesting convergence with implicit linking:

http://www.oasis-open.org/apps/org/workgroup/dita/download.php/15122/IssueNumber43.html

Hoping that's interesting,

Erik Hennum
ehennum@us.ibm.com

"Tony Self" <tself@hyperwrite.com> wrote on 07/20/2006 05:26:07 AM: > G'day all > > I'm relatively new to this TC, and have been lurking for a little > while to get the lay of the land. > > I may have missed some of the indexing debate, so please forgive me > if this suggestion has already been floated. > > With respect to the index range "sub-debate", if an <indexterm> had > a "grouping" attribute, could that be used as a means of identifying > a range? For example, the prolog of five topics might include > <indexterm range="alpha">term1</indexterm>. If an output included > those topics in a contiguous block, and ended running across four > pages of a PDF document, the processor should be able to note the > common "range" attribute, and thus know to present the index keyword > as "term1 pp13-16", rather than "term1 13,14,15,16". If the topics > were not contiguous in another output, the index could still be > correctly presented as "term1 29, 34, 90, 93", for example. > > I think this would still work for inline use? The processor could > work on the logic that if term1(alpha range) appeared on pages 4 and > 5, the term would appear in the index as 4-5. If the term without > the "alpha" range attribute also appeared elsewhere on page 5, the > index would appear as "term1 4-5, 5". > > The downside is obviously the extra overhead in inventing range > attributes, but it may be an easier pass in having indexterms > spanning blocks across topics.
