dita message

Subject: RE: [dita] index terms
From: "JoAnn Hackos" <joann.hackos@comtech-serv.com>
To: "Eliot Kimber" <ekimber@innodata-isogen.com>,<dita@lists.oasis-open.org>
Date: Mon, 3 Oct 2005 08:46:55 -0600
Agreed that we should dispense with index ranges.

We seem close to understanding see and see also.

We recognize primary, secondary, and tertiary index items and have a
mechanism (I think) for designating them.
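
For illustration only, here is a minimal sketch of how a processor might read primary, secondary, and tertiary entries, assuming they are designated by nesting <indexterm> elements. The sample topic, the `TOPIC` name, and the `index_paths` helper are hypothetical and not part of any proposal in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample topic: nesting depth gives the index level.
TOPIC = """<topic id="t1"><title>Sample</title><body><p>Some text
<indexterm>printing<indexterm>PDF<indexterm>page ranges</indexterm></indexterm></indexterm>
<indexterm>printing<indexterm>HTML</indexterm></indexterm>
</p></body></topic>"""


def index_paths(elem, prefix=()):
    """Yield one tuple per innermost indexterm: (primary, secondary, ...)."""
    for child in elem:
        if child.tag == "indexterm":
            path = prefix + ((child.text or "").strip(),)
            nested = list(index_paths(child, path))
            if nested:
                # The entry has deeper levels; report only the innermost ones.
                yield from nested
            else:
                yield path
        else:
            # Keep looking inside ordinary content elements.
            yield from index_paths(child, prefix)


for path in index_paths(ET.fromstring(TOPIC)):
    print(" > ".join(path))
```

Each tuple carries the full path from the primary term down, so a formatter could merge entries that share a primary term without any extra attribute.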

Is there a possible mechanism for linking the term in the generated
index back to the index term in the topic for editing?

Is there a way to use an attribute to indicate the "primary" index term
for those who are still printing from PDF?

Are we ready to make our proposal on the index terms to the TC? Is Chris
willing to revise his original draft in light of the discussion? I think
we've exhausted this topic.

JoAnn

JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver CO 80215
303-232-7586
joann.hackos@comtech-serv.com

-----Original Message-----
From: Eliot Kimber [mailto:ekimber@innodata-isogen.com] 
Sent: Monday, October 03, 2005 8:27 AM
To: dita@lists.oasis-open.org
Subject: Re: [dita] index terms

Erik Hennum wrote:

> The problem is architectural:  properties that span multiple topics should
> be specified in the map context and not in the topic content.

I agree with Erik here: ranges that span topics cannot be done using 
embedded index entries. They must be done at the map level.

Note that this is a special case of the more general problem of managing 
linking among re-used components (for example, the case where Topic A 
wants to have a cross reference to Topic B).

The only way to support this type of direct linking in a re-use 
environment is to do it in such a way that the links are bound to the 
use context, not the individual components being used.

In DITA terms this means "at the map level" because it is the map that 
defines a specific use context, that is, a unique packaging of a set of 
components for a specific delivery purpose (what I usually call a "unit 
of publication").

However, having said that, it's difficult for me to imagine that more 
than a few, if any, DITA users would actually eat the expense of doing 
indexing that is this sophisticated. Given the relatively low retrieval 
value of back-of-the-book indexes for information delivered primarily in 
electronic form, it's difficult to see that an authoring group would 
choose to invest its limited resources in indexing rather than some 
other, higher-value aspect of the information.

Any publication group that cares that much about indexes is probably a 
print-primary group for whom DITA is not an appropriate choice in any
case.

Therefore, I do not see any compelling reason to try to design an 
explicit index range mechanism for DITA, at least not one that can span 
topics.

> A last consideration.  The <term> and <keyword> elements delimit controlled
> vocabularies that are embedded in the discourse.  Should the writer have to
> add an index marker to index such instances of controlled vocabularies?  Or
> would we be better off indexing delimited vocabularies (possibly under the
> control of policies)?

I would expect a full-featured DITA processor to provide the option of 
including all keywords, terms, and other clearly-defined "mention" 
instances in a back-of-the-book index.
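
As a rough sketch of what such an option might do, the following collects the text of <term> and <keyword> elements from a set of topics and records the topics that mention them. The choice of "mention" elements, the `MENTION_TAGS` and `mention_index` names, and the sample topics are assumptions made for illustration; they do not describe any existing DITA processor.

```python
import xml.etree.ElementTree as ET

# Assumption: only these two elements count as "mention" elements here.
MENTION_TAGS = {"term", "keyword"}


def mention_index(topics):
    """Map each mention's text to the sorted ids of the topics that use it."""
    index = {}
    for xml_text in topics:
        root = ET.fromstring(xml_text)
        topic_id = root.get("id")
        for elem in root.iter():
            if elem.tag in MENTION_TAGS:
                # Collapse internal whitespace so line breaks do not split entries.
                label = " ".join("".join(elem.itertext()).split())
                if label:
                    index.setdefault(label, set()).add(topic_id)
    return {label: sorted(index[label]) for label in sorted(index, key=str.lower)}


# Hypothetical sample topics.
TOPIC_A = ('<topic id="a"><title>A</title><body><p>The <term>use context</term> '
           'is defined by the <keyword>map</keyword>.</p></body></topic>')
TOPIC_B = ('<topic id="b"><title>B</title><body><p>See the '
           '<keyword>map</keyword>.</p></body></topic>')

for label, ids in mention_index([TOPIC_A, TOPIC_B]).items():
    print(label, "->", ", ".join(ids))
```

The result is a flat list of entries keyed by the mention text; nothing in the markup tells the sketch how to group entries under a parent term.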

The only downside here is that there's probably no reliable way to infer 
a multi-level hierarchy, but that's probably not a big problem in
practice.

I would normally expect explicit index entries to be an escape for 
authors when the use of existing classifying metadata and "mention" 
elements isn't sufficient to produce a usable index or acceptable 
retrieval performance.

Cheers,

E.

-- 
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8841

ekimber@innodata-isogen.com
www.innodata-isogen.com
Follow-Ups:
- Re: [dita] index terms
  - From: Eliot Kimber <ekimber@innodata-isogen.com>