dita message
Subject: RE: [dita] indexing question
- From: "Chris Wong" <cwong@idiominc.com>
- To: "Grosso, Paul" <pgrosso@ptc.com>,<dita@lists.oasis-open.org>
- Date: Fri, 21 Jul 2006 10:05:57 -0400
I think we are talking about a different issue here. The subflow vs inline problem
has to do with how indexterms get presented to the translator. Translation tools
in general don't actually process DITA: they process an intermediate
representation like XLIFF or TTX. In any case, the issue that I hopefully called
"resolved" does not concern the location of the index entry in DITA
output.
On your issue, I would answer in two ways:
- A topic prolog sits outside content, so it is irrelevant where it sits relative
to shortdesc. When an indexterm is found in a topic prolog, it points to that
topic and should refer to the start of that topic, not the start of whatever
comes after shortdesc. So it's really a processing issue, and I would argue
that with appropriate processing we can "guarantee that the effect is going to
be the desired one".
- You can in fact today put an indexterm in a title: you just have to wrap it
in a ph element. This is just a tool/UI issue. That said, I have no objection
to a proposal to have indexterms as direct children of <title>. I agree
that it's an odd omission.
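
For illustration, the ph workaround described above might look like the following. This is only a sketch: the topic id, title text, and index entry are made-up values, and the placement of the ph at the start of the title is one possible choice, not a recommendation from the thread.

```xml
<!-- Hypothetical topic: the id, title text, and index entry are illustrative only. -->
<topic id="install_overview">
  <!-- indexterm is not a direct child of title, so it is wrapped in ph -->
  <title><ph><indexterm>installation</indexterm></ph>Installation overview</title>
  <shortdesc>A short description that may run long in print.</shortdesc>
  <body>
    <p>Topic content.</p>
  </body>
</topic>
```

Because the index entry is anchored in the title, it should resolve to the page where the topic starts, however long the shortdesc is.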
Chris
We could resolve the issue this way. The problem with
this solution is the case where you have a long shortdesc (which precedes the
prolog) so that the point-wise, subflow indexterm in the prolog ends up on the
second page of the topic so your index entry does not have the page number of
the first page in the topic. So, in fact, an indexterm in a prolog really has no
useful purpose, since you cannot guarantee that its effect is going to be the
desired one.
So I could live with this solution provided that we allow
indexterm in other places where it is currently not allowed (e.g., title) so that
a user can ensure they get an index entry pointing to the first page of a topic.
And once we do that, we would then issue a "best practices"
statement saying not to put indexterm within prolog.
paul
From what I understand, the issue of treating
indexterm differently in topic prolog vs content was due to the
misunderstanding that the content of indexterm in content actually
appears as part of that content. Since we have now clarified that
indexterm's content is always a subflow, we can treat indexterm uniformly in
both topic prolog and content. Is my understanding correct that this issue
is now resolved?
Chris
Hi, Rodolfo, Dave, and other index enthusiasts:
A lightbulb went off
for me. I think we're conflating two cases here.
In the case of <keyword>, the element is an inline within content and a
subflow in the prolog, as Rodolfo has stated.
In both cases, <keyword>
identifies a word from a vocabulary. In the inline case, the vocabulary word
is delimited within the flow. In the prolog case, the vocabulary word is
identified as potential metadata for search engines.
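
A small sketch of the two <keyword> cases just described, assuming the usual prolog structure in which keywords are nested inside metadata. The topic id, title, and the word "loglevel" are invented for the example.

```xml
<!-- Hypothetical topic: id, title, and the keyword value are illustrative only. -->
<topic id="configure_logging">
  <title>Configuring logging</title>
  <prolog>
    <metadata>
      <keywords>
        <!-- Prolog case: vocabulary word offered as metadata, not part of the flow -->
        <keyword>loglevel</keyword>
      </keywords>
    </metadata>
  </prolog>
  <body>
    <!-- Inline case: the same word delimited within the flow of the paragraph -->
    <p>Set the <keyword>loglevel</keyword> option before starting the server.</p>
  </body>
</topic>
```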
In neither case do
we have a base processing expectation of producing published indexes for the
<keyword> element. We have sometimes speculated about the possibility of
generating indexes from inline mentions (in Eliot's term) of vocabulary words,
but I believe we've always deferred that.
By contrast,
<indexterm> is a subflow in all cases as Chris has stated. The contents
of <indexterm> must be translated, but the translation of the content in
which <indexterm> is embedded isn't affected in any way by the
positioning of <indexterm>. That is, <indexterm> does not delimit
part of the flow.
In the prolog, <indexterm> is specified as
serving two purposes: feeding index terms to search engines as part of the
metadata and indexing the topic.
Digression: The case could be made to
treat <term> in exactly the same way as <keyword>, using
<keyword> for words from formal languages and <term> for words
from cultural or social vocabularies. That would require adding <term>
to the prolog.
Hoping that clarifies,
Erik Hennum
ehennum@us.ibm.com
"Rodolfo M. Raya"
<rodolfo@heartsome.net>
07/18/2006 02:15 PM
On Tue, 2006-07-18 at 15:03 -0500, David Walters wrote:
Hi,
New example:
The <p> text is complete.
<topic>
  <prolog>
    <indexterm>term one</indexterm>
    <indexterm>term two</indexterm>
  </prolog>
  <body>
    <p>Paragraph that contains term one
    <indexterm>term one</indexterm>
    and term two <indexterm>term two</indexterm> inside.</p>
  </body>
</topic>
If the content of <indexterm> is completely ignored when the topic is
published as XHTML, PDF or any other format, then this element should be
completely ignored at translation time.
The content of <indexterm> doesn't need to be
translated if it is only a location marker. The whole element can be replaced
with a tag by the translation tool.
Best regards,
Rodolfo