
Subject: RE: [dita] Proposed index range revisions (was Re: [dita] Are indexterm ranges backwards incompatible?)

From: Michael Priestley <mpriestl@ca.ibm.com>
To: "Esrig, Bruce (Bruce)" <esrig@lucent.com>
Date: Thu, 17 Aug 2006 10:40:24 -0400

re 1) remote indexing:
One of the reasons maps scale well is because they focus on the topic level and are thus insulated from changes/churn within the topics they point to. So I'd be wary of introducing a mechanism in maps that allows dependencies on element IDs within target topics. We definitely have a need to index at both authoring and assembly time, but I think we should limit the assembly-time indexing to topic-level, to avoid complicating the map architecture and breaking the separation of maps and topics.

re 2) keyref and scoping:
One of the reasons keyref is CDATA rather than NMTOKEN is to allow scoping where/when necessary, eg using URIs or other package/directory scoping conventions. The exact mechanism isn't determined, but the possibility is definitely there.

re 3) directives for usage/behavior, and control of processing in the markup:
I think directives for usage in the spec are acceptable/appropriate as long as we distinguish required (eg don't put cereal in your eye) vs. recommended (usually taken with milk) vs. suggested serving (nice with strawberries). And when the TC has a deep split on an issue (eg 50/50 on either side of an issue) it makes sense to me to just avoid the issue if possible (in our particular case, simply not document whether there should be any special treatment for the importance attribute on indexterm).

In terms of where to define general processing behavior for a particular deliverable: I agree that some kind of separate file would be appropriate: something like ditaval, or a separate build options file. I think that's out of scope for 1.1 - I believe we already have a proposal for 1.2 about having a style policy file, and the discussion probably has a home there.

Michael Priestley
IBM DITA Architect and Classification Schema PDT Lead
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25

"Esrig, Bruce (Bruce)" <esrig@lucent.com>

08/17/2006 08:53 AM

To	"Erik Hennum" <ehennum@us.ibm.com>, Michael Priestley/Toronto/IBM@IBMCA
cc	<dita@lists.oasis-open.org>, "Grosso, Paul" <pgrosso@ptc.com>
Subject	RE: [dita] Proposed index range revisions (was Re: [dita] Are indexterm ranges backwards incompatible?)

There are three questions troubling me now, and they are very different:
1. Remote indexing. Do we need both the ability to introduce index entries when writing a topic and when assembling topics? If so, do we need a remote referencing mechanism from the assembly level (maps) into the specific locations in topics to be indexed? This would be a similar need to that identified for the <data> element.
2. Keyref. Granted, keyref seems like a natural way to do matching of index start and end indications if any innovative mechanism is provided. About the keyref architecture itself ... Will keyref have a cluttered namespace? Is there room for namespacing or scoping in the anticipated keyref architecture?
3. Is it appropriate to allow for directives? The Sperberg-McQueen reference
http://www.idealliance.org/papers/extreme/proceedings/html/2005/SperbergMcQueen02/EML2005SperbergMcQueen02.html
is very helpful in providing philosophical background here. Even if we have an orthodoxy about descriptive markup, we have an obligation to consider what directives we would allow and where they would appear. In explaining my poster at the IA Summit
http://www.iasummit.org/2006/files/169_Presentation_Desc.pdf
one of my impulses was to say that we need to look at the control that we offer to authors over the effect in presentation. We have already conceded that we must provide local markup to allow authors to specify layout facts such as the size of graphics and/or table columns. We've seen a possible need to add markup to an <indexterm> to emphasize a particular index entry.

It's possible that directives that control processing behaviors such as formation of ranges would be appropriate in a high-level document assembly context, although in the case of this particular behavior, a better case might be made for establishing a conventional place for such directives in an associated file (the DITAVAL file?).

Best wishes,

Bruce

From: Erik Hennum [mailto:ehennum@us.ibm.com]
Sent: Wednesday, August 16, 2006 8:38 PM
To: Michael Priestley
Cc: dita@lists.oasis-open.org; Esrig, Bruce (Bruce); Grosso, Paul
Subject: RE: [dita] Proposed index range revisions (was Re: [dita] Are indexterm ranges backwards incompatible?)

Hi, Index Enthusiasts:

For what it's worth, Sperberg-McQueen asserts that an XML specification should "get the key things down in writing without over-restricting things, without over-emphasizing the orderliness that we perceive, without filtering out signal unintentionally." [1]

Trying to keep that judicious big picture for the indexing question, I would think that we should:

Make it easy to indicate _what_ is indexed but leave output decisions up to the process within reason. (In other words, allow a process to emit a page range if an indexed topic spans 20 pages or if 3 contiguous topics are indexed with the same term.)
Accept that indexing has an implicit ambiguity and allow processes to interpret an index both as a point / span over the flow for purposes of determining page numbers and as a semantic assertion about the container of the index item.
Rely on specialization to distinguish different types of index items (especially differences in semantics).
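
The first point above (letting the process, not the markup, decide when a page range is emitted) could be sketched roughly as follows. This is a minimal illustration in Python, assuming the process already knows the page numbers on which a term is indexed; the names collapse_pages and format_entry are hypothetical and do not come from any DITA toolkit.

```python
from itertools import groupby


def collapse_pages(pages):
    """Collapse page numbers into (start, end) runs of consecutive pages."""
    ordered = sorted(set(pages))
    runs = []
    # Within a run of consecutive pages, (page - position) stays constant,
    # so grouping on that difference splits the list into runs.
    for _, group in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        members = [page for _, page in group]
        runs.append((members[0], members[-1]))
    return runs


def format_entry(term, pages):
    """Render one index entry, writing runs longer than one page as ranges."""
    parts = [str(a) if a == b else f"{a}-{b}" for a, b in collapse_pages(pages)]
    return f"{term}, " + ", ".join(parts)


if __name__ == "__main__":
    # Three contiguous topics on pages 12, 13 and 14 plus one mention on page 20.
    print(format_entry("query", [14, 12, 13, 20]))
```

Under these assumptions the authors only mark what is indexed; whether adjacent hits become a range is an output decision that a different process could make differently.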

Part of the challenge is that indexing is partly contextual (as Paul Prescod has pointed out [mails coming in faster than I can type]):

A topic might have the most important treatment of a subject in one deliverable but not in another.
The best term for the subject may be different in one deliverable than another. For instance, I might need to change the term from "selection" to "query" depending on the other topics and the terms they use or based on the audience. Moreover, I'd like to make that change in one place for all of the topics in the deliverable.
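
The "change in one place" requirement could be pictured as a single per-deliverable override table applied to the terms that each topic carries. This is only a sketch in Python under that assumption; apply_term_overrides and the sample data are hypothetical and do not describe an existing DITA mechanism.

```python
def apply_term_overrides(topic_terms, overrides):
    """Return a copy of topic_terms with deliverable-level term overrides applied.

    topic_terms maps a topic name to the index terms authored in that topic.
    overrides maps an authored term to the term preferred in this deliverable.
    Terms without an override are kept as authored.
    """
    return {
        topic: [overrides.get(term, term) for term in terms]
        for topic, terms in topic_terms.items()
    }


if __name__ == "__main__":
    authored = {
        "searching.dita": ["selection", "filters"],
        "reports.dita": ["selection"],
    }
    # One entry in the deliverable's table renames the term in every topic.
    print(apply_term_overrides(authored, {"selection": "query"}))
```

The authored topics are left untouched, so the same topics can be reused in another deliverable with a different table or with none.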

In DITA, the representation of context is the map, which suggests meeting these requirements through the map. However, the positioning of index points and ranges with respect to the flow is clearly best done within the topic. Moreover, when I reuse a topic, I don't want to have to reconstruct its indexing in each context.

Also, DITA would benefit from a continuum of use -- being able (but not required) to scale up to a rigorous separation of term from its sense (taxonomy, here we come).

In short, we defined DITA 1.1 as the simple cut back in February and have many tough questions remaining that might best be attacked as a whole.

For explicit ranges, my main concern is that we avoid multiplying DITA referencing mechanisms. If we're confident that we aren't introducing a constraining legacy, I'm happy with keyref.

Hoping that's useful,

Erik Hennum
ehennum@us.ibm.com

[1] http://www.idealliance.org/papers/extreme/proceedings/html/2005/SperbergMcQueen02/EML2005SperbergMcQueen02.html
