Chris, Erik, and Bruce,
I want to bring up a concern that was
discussed at last Monday’s meeting of the Translation Subcommittee. We
are working on an indexing best practice that ensures that index terms do not
interrupt sentence flow for segmentation. In the discussion of index ranges,
the translation professionals were concerned about having to duplicate the
index tag content in the start and end tags. Please clarify for us if that is
indeed the case:
Startindexterm = DITA and Endindexterm =
The concern is that the indexterm may not
be translated exactly the same way by two different translators working on
topics in parallel or even by the same translator working at different times on
different topics. That is – if the index start and end ranges can span
topics (is that the case?).
There was also a concern that having the
same text entered twice might result in a spelling error that would affect
Here are the recommendations that the SC
has been discussing. Please let me know if there are misconceptions.
- Insert index
entries that refer to entire topics in the prolog element using the
<keywords> tag (prolog—metadata—keywords—indexterm).
Index entries using <keywords> should be processed as index terms referring
to the beginning of the referenced topic. clarify
- Insert all
block-level index tags immediately following the start tag of the nearest
containing block element.
- If an index
term is intended to span several elements in one topic, insert the start
range at the beginning of the start block (i.e., the parent block element) and
the end range markup at the end of the end block element. See Chris Wong
- Question: Is
this allowed at all? If you want an index term to span a group of topics,
insert the start range of the index tag in the prolog of the first topic
and the end tag in the prolog of the last topic in the DITA map. Do not do
this. Prolog across multiple topics.
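To make the first two recommendations concrete, here is a minimal Python sketch, not a reference implementation. The sample topic, its id, and the helper names `topic_level_terms` and `index_terms_lead_block` are hypothetical; only the prolog/metadata/keywords/indexterm path and the "immediately following the start tag" rule come from the recommendations above.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical sample topic: one topic-level term in the prolog and one
# block-level term placed right after the start tag of its <p> block.
TOPIC = """\
<topic id="installing">
  <title>Installing the product</title>
  <prolog>
    <metadata>
      <keywords>
        <indexterm>installation</indexterm>
      </keywords>
    </metadata>
  </prolog>
  <body>
    <p><indexterm>prerequisites</indexterm>Check the prerequisites first.</p>
  </body>
</topic>
"""


def topic_level_terms(topic_xml: str) -> List[str]:
    """Return index terms found at prolog/metadata/keywords/indexterm."""
    root = ET.fromstring(topic_xml)
    terms = root.findall("./prolog/metadata/keywords/indexterm")
    return [(term.text or "").strip() for term in terms]


def index_terms_lead_block(block: ET.Element) -> bool:
    """True if the block's first child is an indexterm with no text before it."""
    children = list(block)
    has_leading_text = bool((block.text or "").strip())
    return bool(children) and children[0].tag == "indexterm" and not has_leading_text


if __name__ == "__main__":
    print(topic_level_terms(TOPIC))
    paragraph = ET.fromstring(TOPIC).find("./body/p")
    print(index_terms_lead_block(paragraph))
```

Because the block-level term sits before any sentence text, the sentence itself stays uninterrupted for segmentation, which is the point of the second recommendation.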
Thanks for your help, … JoAnn
JoAnn T. Hackos, PhD
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver, CO 80215
From: Esrig, Bruce
Sent: Monday, July 10, 2006 4:23
To: 'Chris Wong'; Erik Hennum;
Subject: RE: [dita] indexing
Well, we may need to discuss it, but here's a position statement.
As Chris Wong wrote,
index entries are point-like by default.
Here's a potential accommodation.
Chris wrote: > We can leave open the possibility that a processor may elect
to treat an indexterm in a topic prolog as a page range: for example, if that
topic is deeply nested.
This one is a tempting accommodation, but
I'll try an argument that justifies not making this accommodation.
Suppose that we are looking at a topic
with no nested sub-topics.
When indexing the first reference to an
item, the entry should generate a point reference to the initial point where
that item enters the discussion. If the item is a prominent item within that
scope, a reference to the initial point is sufficient, because the
reader is likely to be interested in a large fraction of the scope without
being prompted by an index entry.
If the item is a subsidiary item in the
scope and only occurs once, a reference to the initial point is
sufficient, because the item only occurs once.
If the item is a subsidiary item that
occurs multiple times, or if the occurrences span multiple adjacent scopes,
then a page range is appropriate.
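As a rough illustration only, the three cases above could be written as a small decision function in Python. The names `IndexedItem` and `reference_kind` are hypothetical. The sketch also assumes that occurrences spanning multiple adjacent scopes call for a range regardless of prominence, which is one reading of the last case.

```python
from dataclasses import dataclass


@dataclass
class IndexedItem:
    prominent: bool                      # prominent item within its scope?
    occurrences: int                     # number of occurrences in the scope
    spans_adjacent_scopes: bool = False  # occurrences span adjacent scopes?


def reference_kind(item: IndexedItem) -> str:
    """Return "point" or "range" following the guidelines above."""
    # Assumption: spanning adjacent scopes implies a range in every case.
    if item.spans_adjacent_scopes:
        return "range"
    # Prominent item: the initial point is sufficient.
    if item.prominent:
        return "point"
    # Subsidiary item: a range only if it occurs more than once.
    return "range" if item.occurrences > 1 else "point"


if __name__ == "__main__":
    print(reference_kind(IndexedItem(prominent=True, occurrences=5)))
    print(reference_kind(IndexedItem(prominent=False, occurrences=1)))
    print(reference_kind(IndexedItem(prominent=False, occurrences=3)))
```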
Now applying these guidelines to
topic-level index entries ...
A topic-level index entry is an assertion
that the item is a prominent item within that scope. The reference is to the
topic as a whole, and a reference to the initial point is sufficient.
A start-of-range assertion at the topic
level is not well defined. How do you know in a single topic that there will be
other subsequent topics that will address the same item? Ranges are inherently
appropriate for spans across contents of a topic or contents of a grouping of
In a map, a start-of-range assertion does
From: Chris Wong
Sent: Monday, July 10, 2006 6:05
To: Erik Hennum; Grosso, Paul
Subject: RE: [dita] indexing
One question that comes to mind is: why
would you want a page range that spans one and only one topic? For example, I
pulled out my old "XML in a Nutshell" and looked up "Arabic
Unicode block". This table spans 2 pages, but is only indexed with a page
number pointing to the start of the topic. That is because the topic is so
obviously self-enclosed that a single page reference is sufficient.
What I'd say is that an indexterm
in a topic prolog points to the topic. Page range markers in a topic prolog have
no meaning, since the indexterm is out of the content flow. So
index-range-start/index-range-end should be ignored. This will allow an author
to generate an index reference to a single topic by entering an indexterm in
the topic prolog.
We can leave open the
possibility that a processor may elect to treat an indexterm in a topic
prolog as a page range: for example, if that topic is deeply nested.
Sent: Friday, July 07, 2006 3:08
To: Grosso, Paul
Subject: RE: [dita] indexing
(Grosso) and Indexing Enthusiasts:
To follow up on the index range question, we had a fair bit of discussion about
ranges last Fall. The consensus at the time was that ranges should be set
explicitly. A sample from the thread:
I guess my perspective remains that an indexterm in the prolog could be treated
as a special case of a general rule: that an indexterm covers the content of
its container and that processing emits a page range if the indexed container
extends to more than 2 pages.
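A minimal Python sketch of that general rule follows, assuming a formatter that already knows the first and last page of the indexed container. The function name `format_locator` and the `threshold` parameter are hypothetical; the default of 2 mirrors the "more than 2 pages" wording above.

```python
def format_locator(first_page: int, last_page: int, threshold: int = 2) -> str:
    """Emit a page range only when the container covers more than `threshold` pages."""
    if last_page < first_page:
        raise ValueError("last_page must not precede first_page")
    page_count = last_page - first_page + 1
    if page_count > threshold:
        return f"{first_page}-{last_page}"
    # Short containers get a point reference to the starting page.
    return str(first_page)


if __name__ == "__main__":
    print(format_locator(10, 10))
    print(format_locator(10, 11))
    print(format_locator(10, 12))
```

Under this reading, an indexterm in the prolog needs no special handling: its container is the topic, so the same check applies to the topic's page span.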
Even so, I don't want to undo the progress we've made:
Can we isolate any anomalies in the current indexing proposal and fix those
quickly without changing the fundamental approach?
Paul, are you aware of other hiccups besides the requirement to index an entire
topic from start of the title through the end of the related links or the end
of the nested topics?
Chris (Wong), as the lead on the indexing proposal, do you have any
Hoping that's useful,
06/29/2006 05:05 PM
From: Erik Hennum [mailto:email@example.com]
Sent: Thursday, 2006 June 29 18:13
To: JoAnn Hackos
Cc: firstname.lastname@example.org; Grosso, Paul
Subject: RE: [dita] indexing question
That said, we still need a way to
generate a range over the whole topic.
Huh? I would have thought what you just said in the first
paragraph means that an indexterm within the prolog generates a range over the
whole topic. Now I'm really confused.