dita message

Subject: RE: [dita] index terms
From: "Chris Wong" <cwong@idiominc.com>
To: <dita@lists.oasis-open.org>
Date: Mon, 3 Oct 2005 18:06:35 -0400
I agree with the general idea you present, that we should not overengineer for the less useful cases, and indeed with most of what you just wrote. I disagree only with the specific contention that page ranges belong in the category of unnecessary frills. Not everyone uses primary entries, but just about every single book index I've seen uses page ranges. It is one of those features that are actively in use in real life, by real-life authors in real production environments.

Chris

-----Original Message-----
From: Eliot Kimber [mailto:ekimber@innodata-isogen.com]
Sent: Monday, October 03, 2005 1:56 PM
To: dita@lists.oasis-open.org
Subject: Re: [dita] index terms


Erik Hennum wrote:

> For accessing well-defined chunks of information within a minimalist book,
> the index becomes, if anything, more important.

But how important is the *sophistication* of the index? That is, I would 
expect an index to list every useful keyword but I wouldn't necessarily 
expect it to have frills like page ranges or primary entries. Assuming 
authors use topic-level metadata and appropriate "mention" elements, 
somewhere between 80 and 100 percent of a back-of-the-book index should 
be generatable from that data alone, depending on the nature of the 
content and the type of document (easier for reference, less so for 
conceptual or procedural).

I guess what I'm really trying to say is that adding features like ranges,
or sophisticated ways to bind multi-level entries to containers or manage
controlled vocabularies, feels like over-optimization to me; they
don't meet the 80/20 cut.

One of the challenges with a system like DITA is that it enables lots of 
really cool ways to do things structurally that can make the core 
information very sophisticated. The more of these that are in the design 
the more temptation there is for the designers to add more of them. I 
know this because I lived it for a number of years in helping to define 
the original IBM ID Doc and then HyTime 2.

But the painful lesson I learned was that by and large most document 
creators are simply incapable of using much of what the designers can 
imagine for the simple reason that the intellectual and labor overhead 
of using the features isn't (or doesn't appear to be at the moment) 
balanced by the value provided by their use. This is the lesson of HTML 
and XML.

So maybe I'm just oversensitive to potential overengineering, but I know 
in my gut that, regardless of all the clever ways that we can provide 
for creating index entries, the vast majority of authors doing indexing 
just want to slap index markers into their content and go on.

The other part of this is that the modular, deconstructed nature of DITA 
content makes defining indexes in particular more involved than it is in 
simpler book-primary structures like DocBook, where you can just define 
the index and go. This means that there is a higher cost of design for us 
and a higher cost of use for users to get the same indexing features. 
That's why I urge caution in evaluating these features.

Cheers,

E.
-- 
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8841

ekimber@innodata-isogen.com
www.innodata-isogen.com