Perhaps a flurry of semantic technical philosophy
would help to defuse this debate ...
Users are not assuming that they are writing documents in a
semantically well-founded language with well-known static semantics and
processing semantics.
It is we who are appealing to semantics and formal language
modeling to express criteria for determining what constructs to specify and
support.
Once a construct is selected, there may well be debate
regarding its underlying semantics, either from the point of view of static
semantics (what does it mean to say that?) or processing semantics (what happens
when I say that?).
In the case of index ranges, we didn't specify what it
means to put an index term in a prolog (although in this thread, I tried!). What
we specified was that an index term is legal in a prolog, and (so far) that it
doesn't make sense to specify an index range in a prolog. The wiggle room in
what was specified leaves an opening for alternate interpretations. Although we
might debate which of the alternate interpretations we consider preferable,
ultimately, we need to ask whether it is within the scope of the specification
to resolve each debate.
The purpose of the specification is to ensure that there
are constructs that people can use to write documentation and perform related
tasks. There has to be enough commonality among the interpretations that the
source materials are reasonably portable. That means that each set of compliant
source materials must produce reasonable output wherever it is ported to. If
there is one domain within which an index term in a prolog generates a page
range and another in which it doesn't, that is perfectly acceptable, provided
that we leave that wiggle room in the specification.
The question for the specification is whether there is a
compelling reason to remove the wiggle room. In this case, there might not
be.
Best wishes,
Bruce Esrig
Again, you misunderstand my point, Michael: the mere introduction of page
ranges in an index where they weren't allowed before changes the meaning of
every single page reference in that index.
Where before they could be pointing to extended discussions, now they must be
interpreted as pointing to brief mentions.
Thus using even a single ranged indexterm breaks backwards compatibility for
the writer that uses it.
Michael Priestley wrote:
Fair enough - not every writer,
just every writer who makes use of an index range somewhere in their
deliverable, or has content reused by someone else who makes use of an index
range somewhere.
So:
- it breaks backwards compatibility for every context that uses index ranges
- it breaks best practices for indexing
Can we call this an interesting idea but not
appropriate for the spec and move on to the next issue?
Michael Priestley
IBM DITA Architect and Classification Schema PDT Lead
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25
That's not correct, Michael - it only requires writers
who wish to make use of ranged indexterms elsewhere to rewrite their
content.
If they don't, no reworking is required.
Michael Priestley wrote:
If we follow your suggestion then we're throwing a switch that
requires every writer currently using indexterms in prologs to rewrite their
content to preserve their existing behavior.
I think it makes the most sense
both from a new user perspective (per JoAnn's indexing best practice points)
and from an existing user perspective (per my backwards compatibility
points) to say that indexterms without ranges behave exactly the same way
tomorrow as they do today.
If a particular project wants the behavior you
describe, they can write their content that way (i.e. with index range
elements), or override processing to change the default behavior (i.e. get
range outputs from indexterm markup).
Michael Priestley
IBM DITA Architect and Classification Schema PDT Lead
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25
What if we look at this
new feature as throwing a switch?
If a writer doesn't make use of it,
and refrains from inserting even one ranged indexterm into a book, then they
get 1.0 pointwise processing.
If, however, a user inserts even one
ranged indexterm into a book, then the ambiguity inherent in their legacy
indexterms is resolved as follows:
- indexterms that appear in the body of the text are
considered pointwise. If they aren't, then the writer needs to insert new
start attributes and end elements into the body of the text.
- indexterms that appear in topic metadata are considered
to apply to the topic as a whole, and as such generate a page range in the
index entry that corresponds to the page range of the topic. If the writer
doesn't like this, they need to go in and move the offending indexterms to
the most appropriate point in the body of the text.
Dana
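The switch proposed above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the message, not anything taken from the DITA specification or a DITA processor; the names IndexTerm, location, is_range, and resolve_terms are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class IndexTerm:
    term: str
    location: str           # "body" or "prolog" (topic metadata); hypothetical labels
    is_range: bool = False  # True for a ranged indexterm


def resolve_terms(terms: List[IndexTerm]) -> List[Tuple[str, str]]:
    """Apply the proposed switch to every indexterm in one book."""
    # The switch is thrown as soon as the book contains one ranged indexterm.
    switch_on = any(t.is_range for t in terms)
    resolved = []
    for t in terms:
        if t.is_range:
            resolved.append((t.term, "range"))
        elif not switch_on:
            # No ranged indexterm anywhere: 1.0 pointwise processing.
            resolved.append((t.term, "point"))
        elif t.location == "prolog":
            # Metadata indexterms are taken to cover the whole topic.
            resolved.append((t.term, "topic-range"))
        else:
            # Body indexterms stay pointwise.
            resolved.append((t.term, "point"))
    return resolved


legacy = [IndexTerm("cheese", "prolog"), IndexTerm("wine", "body")]
print(resolve_terms(legacy))
print(resolve_terms(legacy + [IndexTerm("bread", "body", is_range=True)]))
```

In the first call nothing changes for the legacy content; in the second, adding a single ranged indexterm changes how the prolog indexterm is read, which is the behavior the rest of the thread argues about.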
Chris Wong wrote:
"A distinction is sometimes made between continued discussion of a subject
(index, for example, 34-36) and individual references to the subject on a
series of pages (34, 35, 36)." -- 17.9, Chicago Manual of Style

I'd say that the difference between a page range indexterm pair and a series
of individual indexterms would make that distinction. Never assume that the
page references should be combined.

I'd ask whether clarifying an ambiguity in the standard really counts as an
incompatibility. If we strive to cater to every possible interpretation of
any ambiguity in the spec, we'd drive ourselves batty. I'm of the opinion
that our spec really says what the user can do and makes no attempt at a
comprehensive list of what a user cannot do. The latter would need an
inconveniently large truck to hold the resulting document. So if a user
writes DITA and expects processing behavior that the standard does not
expressly support, that user should not expect that nonstandard behavior to
be implemented by everyone. Indeed, expecting an unpromised feature of DITA
would easily lead to interoperability problems even within a DITA version,
let alone across versions.

As I see it, this is probably not that big an issue because the XML itself
will continue to be valid, and the user can continue to use legacy
processing. Such XML cannot interoperate across DITA 1.0 implementations
anyway.

Chris
From: JoAnn Hackos [mailto:joann.hackos@comtech-serv.com]
Sent: Tuesday, August 15, 2006 1:47 PM
To: Grosso, Paul; dita@lists.oasis-open.org
Subject: RE: [dita] Are indexterm ranges backwards incompatible?

I would not agree with the result assumptions. What mechanism exists for the
numbers 5, 6, 7, and 8 to be concatenated into a range 5-8? A continuous
discussion ranging over pages 5-8 does not mean the same as points
referenced by the numbers 5, 6, 7, and 8. The indexer should be solely
responsible for determining when a range of pages is used, rather than
having some automatic decision made.

JoAnn

JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver, CO 80215
303-232-7586
joann.hackos@comtech-serv.com
joannhackos Skype
www.comtech-serv.com
From: Grosso, Paul [mailto:pgrosso@ptc.com]
Sent: Tuesday, August 15, 2006 11:21 AM
To: dita@lists.oasis-open.org
Subject: RE: [dita] Are indexterm ranges backwards incompatible?

I generally agree with Bruce here.

But I also need to take issue with:

  new ranged indexterms they add would cause these old point indexterms to
  be misinterpreted

With our existing indexterm markup, you cannot distinguish between use of
indexterms and ranges by looking at the resulting index. An indexterm marks
a point, and the page on which that point falls will be included in the
resulting index. An index range marks a start and end point, and all pages
starting with the one on which the start point falls and ending with the one
on which the end point falls will be included in the resulting index.

Unless one has a fancier indexing process whereby one can, say, request a
bold page number in the index for the most important reference and italic
page numbers for pages on which there are related figures, etc., there is no
distinction among page numbers in the resulting index.

Looking at the resulting index, one cannot tell if index-page-range markup
was used to create that index or not. A resulting index entry of:

  cheese 2, 5-8, 12

could have been generated by pointwise indexterm markup throughout the
source that just so happened to end up being points on pages 2, 5, 6, 7, 8,
and 12.

paul
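The cheese example can be reproduced with a short Python sketch. It assumes, as the example does, that the index formatter merges consecutive page numbers into a range; whether a formatter should do that automatically is exactly what JoAnn's reply questions. The function name format_pages is hypothetical and does not come from any DITA toolkit.

```python
def format_pages(pages):
    """Collapse consecutive page numbers into ranges such as 5-8."""
    runs = []  # each run is [first_page, last_page]
    for page in sorted(set(pages)):
        if runs and page == runs[-1][1] + 1:
            runs[-1][1] = page          # extend the current run
        else:
            runs.append([page, page])   # start a new run
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


# Six pointwise indexterms that happen to fall on these pages.
pointwise = [2, 5, 6, 7, 8, 12]

# Two pointwise indexterms plus one range running from page 5 to page 8.
with_range = [2] + list(range(5, 9)) + [12]

print("cheese", format_pages(pointwise))   # prints: cheese 2, 5-8, 12
print("cheese", format_pages(with_range))  # prints: cheese 2, 5-8, 12
```

Under that merging assumption both sources yield the same entry, so a reader of the finished index cannot tell which markup produced it.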
From: Esrig, Bruce (Bruce) [mailto:esrig@lucent.com]
Sent: Tuesday, 2006 August 15 11:53
To: Dana Spradley
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] Are indexterm ranges backwards incompatible?

On the other hand, Dana, this logic could be applied to outlaw any
extension, since every user would have to review every document to determine
whether they had intended to use the extension.

With DITA 1.1, we clarify that an indexterm designates a point at which to
start reading about the indexed subject. The DITA 1.1 conceit is that this
was true all along. In DITA 1.0, this aspect of the interpretation was
unspecified because there was no way to specify anything else. But if it
even makes sense to take sides on this, it's possible to argue that the
default disambiguation is the DITA 1.1 way. Indexing practice typically
presumes that an index entry refers to a point at which to start reading.

For those who wish to specify a range of pages possibly not starting at the
top of a topic, a new capability is provided that permits such a
specification. The specification of a range generates a page range in
outputs that have page numbers, such as PDF files. In other outputs, it
generates a reference to the start page only.

Best wishes,
Bruce Esrig
From: Dana Spradley [mailto:dana.spradley@oracle.com]
Sent: Tuesday, August 15, 2006 12:41 PM
To: dita@lists.oasis-open.org
Subject: [dita] Are indexterm ranges backwards incompatible?

After this morning's meeting, I'm starting to think that maybe ranged
indexterm should be considered backwards incompatible with DITA 1.0.

In 1.0, it is ambiguous whether indexterms point to discussions confined to
a single page, or to extended discussions that begin on a certain page.

Introducing ranged indexterms removes that ambiguity.

Users who want to make use of ranged indexterms would need to go back
through their entire document set and replace current point indexterms with
ranged indexterms where appropriate - otherwise any new ranged indexterms
they add would cause these old point indexterms to be misinterpreted.

Doesn't that amount to backwards incompatibility?

--Dana