Subject: Re: [dita] Scenario for cross-deliverable referencing
From: Eliot Kimber <ekimber@reallysi.com>
To: Michael Priestley <mpriestl@ca.ibm.com>
Date: Tue, 13 Sep 2011 13:45:27 -0500

In thinking about this more, I think that Michael's approach of thinking of
the rendition-specific key-to-target binding as being a literal DITA map
with literal key definitions is a useful one. It provides a clear syntax for
capturing the binding for interchange purposes, will always work for
distributed processing scenarios, and gives us a clear basis on which to
discuss data details. I will use this approach from now on in my
discussions.

I explore the processing implications and possibilities in some detail
below, but I think my difference with Michael comes down to:

Is it possible to keep the two key spaces for two publications distinct or
must you combine them? I say keep them distinct by enabling addressing of
keys in the context of specific root maps. Michael says combine them so that
existing processors "just work" once you swap in rendition-specific key
bindings, at the cost of requiring coordination of the key names across the
maps involved.

If we stipulate that in both cases the actual rendition processing to create
working links is done by "swapping out" the target map as authored for an
equivalent map that contains rendition-specific key bindings, then there are
no actual processing differences in our two models--the only difference is
in the details of how those rendition-specific maps are coordinated or used,
which is all implementation detail.

That is, the processing doesn't require or disallow my all-knowing
Processing Manager and fully allows Michael's completely informal and
distributed processing environment. All the differences in these two models
are implementation details.

This "swapping out" must be done by specifying a mapping from the target map
as authored (e.g., "map-b.ditamap") to the rendition-specific map to use
instead (e.g., "map-b-PDF.ditamap") as an input to a rendition process. That
mapping has to be known regardless of how the processing is done. It could
be specified as a parameter or it could be specified as instructions to the
human setting up the production, who then reflects the knowledge by
modifying the input map. The functional result is the same.
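
As a rough illustration of the "swapping out" step, here is a minimal
Python sketch. It assumes the mapping is supplied as a simple dictionary
from authored map file name to rendition-specific map file name; the
function name swap_target_maps and the Map A content shown are hypothetical
and are not part of any existing processor.

```python
import xml.etree.ElementTree as ET


def swap_target_maps(map_xml, substitutions):
    """Rewrite href values that point at an authored map so that they point
    at the rendition-specific map named in `substitutions` instead.

    `substitutions` is the map-as-authored to rendition-specific-map mapping;
    how it is supplied is an implementation detail (hypothetical API).
    """
    root = ET.fromstring(map_xml)
    for elem in root.iter():
        href = elem.get("href")
        if href is None:
            continue
        # Keep any fragment identifier intact; only the map URI is swapped.
        base, sep, fragment = href.partition("#")
        if base in substitutions:
            elem.set("href", substitutions[base] + sep + fragment)
    return ET.tostring(root, encoding="unicode")


MAP_A = """<map>
  <title>Map A</title>
  <mapref processing-role="resource-only" href="map-b.ditamap"/>
  <topicref href="topics/topic-01.dita"/>
</map>"""

swapped = swap_target_maps(MAP_A, {"map-b.ditamap": "map-b-PDF.ditamap"})
print(swapped)
```

A person editing the map by hand would produce the same result; the sketch
only shows that the substitution is mechanical once the mapping is known.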

Given that mapping, a human or processor can thus reliably render key
references as authored to working links in the rendition, as long as the
rendition-specific key bindings are correct.

Thus the mechanism by which rendition-specific keys are communicated to or
used by a processor is an implementation detail. The only question is what
does the processor have to do to resolve the keys? Do they always have
exactly one key space or do they need to handle one or more key spaces?

My approach, which requires a new fragment identifier in order to point to
specific keys in the context of specific root maps, reliably keeps distinct
key spaces distinct and removes the need to coordinate names across key
spaces. I think this is essential. It requires processors to handle one or
more key spaces, but I don't think that should pose a problem in practice
because if you can construct one key space you can just as easily construct
100 key spaces. Since the universe has more than one map I would hope that
engineers of DITA-aware systems instinctively provide for the possibility of
multiple key spaces.
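
To make the "one key space or 100" point concrete, here is a minimal Python
sketch that builds one key space per root map. It is a simplification under
stated assumptions: each map is a single document with no submaps, only the
href of each definition is kept, and the first definition of a key in
document order wins. The names build_key_space and ROOT_MAPS are
hypothetical.

```python
import xml.etree.ElementTree as ET


def build_key_space(map_xml):
    """Return {key name: href} for a single root map document.

    Simplified: no submap processing; the first definition of a key in
    document order is the effective one.
    """
    space = {}
    for elem in ET.fromstring(map_xml).iter():
        for key in (elem.get("keys") or "").split():
            space.setdefault(key, elem.get("href"))
    return space


# Hypothetical root maps, keyed by file name.
ROOT_MAPS = {
    "map-a.ditamap":
        '<map><keydef keys="topic-01" href="topics/topic-01.dita"/></map>',
    "map-b.ditamap":
        '<map><keydef keys="topic-02" href="topics/topic-02.dita"/></map>',
}

# One key space per root map: the same function is applied N times.
key_spaces = {name: build_key_space(xml) for name, xml in ROOT_MAPS.items()}
print(key_spaces)
```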

Michael's approach requires combining the key spaces of otherwise separate
publications into a single unified key space. This simplifies processing
where the rendition-specific maps are used literally to implement
cross-publication linking using DITA 1.2 processing, but at the cost of
requiring coordination across all the key spaces that might be combined.

I think that this coordination is impossible in the general, distributed
case, because you may want to link to a publication over which you have no
control and that happens to duplicate some keys in your publication that you
do not want to resolve to that publication.
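
The collision problem can be sketched in a few lines of Python. This is an
illustration only: the key names and topic paths are invented, and the
"first definition wins" merge is an assumed stand-in for what happens when
two otherwise separate key spaces are folded into one.

```python
def combine_key_spaces(*spaces):
    """Merge key spaces into one; the first binding for a key wins.

    Returns the combined space plus the list of keys whose later,
    conflicting bindings were silently dropped.
    """
    combined, conflicts = {}, []
    for space in spaces:
        for key, target in space.items():
            if key not in combined:
                combined[key] = target
            elif combined[key] != target:
                conflicts.append(key)
    return combined, conflicts


# Hypothetical key spaces: both publications happen to define "intro".
mine = {"intro": "topics/my-intro.dita", "topic-01": "topics/topic-01.dita"}
theirs = {"intro": "topics/their-intro.dita",
          "topic-02": "topics/topic-02.dita"}

combined, conflicts = combine_key_spaces(mine, theirs)
print(combined["intro"])  # the other publication's "intro" is unreachable
print(conflicts)
```

Without coordination of the names, there is no way in the combined space to
address the other publication's "intro" at all.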

The only solution in that case is to keep the key spaces separate. DITA 1.2
clearly defines the notion of key space so there can't be any ambiguity
about what is intended when you address a key in the context of a given root
map and it shouldn't be a surprise to any processor that there might be more
than one key space in play at any given point in time (because the universe
contains more than one map).

In the case where you have, for some reason, multiple maps that contribute
to a single rendered publication through some process, it would be up to
that process to generate the appropriate rendition-specific map but it could
do it. In that case there might be a many-to-one mapping from maps as
authored to intermediate maps, but the processing will still work just as it
would for the simpler case of one map exactly equal to one publication.

So I think the question remains: do we allow referencing across key spaces
in a way that keeps key spaces distinct or do we require that all maps that
might want to participate in cross-publication links share a single unified
key space that requires coordination of all key names across those maps?

I feel strongly that the latter is not acceptable or sustainable and that
the implementation cost of allowing cross-key-space referencing is low and
is, in fact, arguably inherent in the DITA 1.2 architecture because it
formally defines the concept of key space. In the case of the Toolkit in
particular, I will personally implement the processing required if that's a
barrier.

It is certainly the case that key-aware editors and component management
systems already have to manage multiple key spaces if they allow management
of multiple maps, which they all do as far as I know (e.g., OxygenXML,
Arbortext Editor 6, XMetal 6). [I don't know of any CMS systems that today
actually manage keys or provide key-resolution services but there might be
some. I'm actively working on adding that functionality to our CMS products,
but it's a low product priority right now.]

The purpose of the rest of this message is to try to define a general
abstract processing model or environment that fits both my
tightly-controlled approach and Michael's arbitrarily distributed model. My
intent is to define some common vocabulary and appropriate abstractions that
let us focus on the general requirements without worrying too much about
implementation details.

Michael is presuming (but not requiring) an environment where there is no
central all-knowing rendition system that maintains knowledge about all the
renditions and the key-to-rendition mappings. I was assuming an all-knowing
Production Manager. But I think for both of us those are implementation
details that don't really change the problem. We were both presuming that
*something* had required knowledge of the renditions involved and the
intents of the renderers--in my case it was a management system, in
Michael's it was the humans requesting the renditions. But I think the
knowledge required in both cases is the same; the only difference is how
that knowledge is captured or communicated, which is an implementation
detail.

The following discussion reflects the real case of the DITA 1.2 spec, where
we have a single content set that needs to be published in at least two
ways: as a single publication combining the Architectural Spec and the
Language Reference and as two separate publications, the Architectural Spec
and the Language Reference, with cross references between the two
publications *as rendered*. I have tried to reflect this case with the
smallest illustrative data set.

Note that in the case of the DITA spec all the content is authored by a
single, coordinated group, so it is possible to coordinate the key names
across all the publication packages that might be applied to the content.
This does not reflect the more general distributed case where you may want
to link to renditions of a publication you only have read-only access to and
for which there is no coordination of its key names with your key names.

Let us have three maps, Map A, Map B, and Map AB, and two topics, Topic 1
and Topic 2.

The author of Topic 1 creates a link to Topic 2 because Topic 1 depends
rhetorically on Topic 2. This is the DITA Spec case, where the arch spec
points to language reference topics (and vice versa).

Topic 1 looks like this:

<topic id="topic-01">
  <title>Topic One</title>
  <body>
   <p>See <xref keyref="topic-02"/>.</p>
  </body>
</topic>

Topic 2 looks like this:

<topic id="topic-02">
  <title>Topic Two</title>
  <body>
   <p>Something important to Topic 1.</p>
  </body>
</topic>

Map AB includes both topics:

<map>
 <title>Map AB</title>
 <keydef
   keys="topic-01"
   href="topics/topic-01.dita"
 />
 <keydef
   keys="topic-02"
   href="topics/topic-02.dita"
 />
 <topicref keyref="topic-01"/>
 <topicref keyref="topic-02"/>
</map>

This is the full DITA spec case, where all the topics are used in the scope
of a single root map. No processing ambiguity.

The other case is where we have two publications, Map A and Map B:

Map A:

<map>
  <title>Map A</title>
 <keydef
   keys="topic-01"
   href="topics/topic-01.dita"
 />
 <keydef
   keys="topic-02"
   href="????"
   format="????"
   scope="????"
 />
 <topicref keyref="topic-01"/>
 <!-- NOTE: No reference to topic-02 -->
</map>

Map B:

<map>
  <title>Map B</title>
 <keydef
   keys="topic-02"
   href="topics/topic-02.dita"
 />
 <!-- NOTE: No reference to topic-01 -->
 <topicref keyref="topic-02"/>
</map>


Processing the publications:

When Map B is rendered to a given output we can capture the key-to-address
mapping in some way, such as Michael's keydefs, e.g.:

Map B-PDF:

<map>
  <title>Map B PDF-specific keys</title>
  <keydef keys="topic-02"
   href="/workspace/output/map-b/pdf/map-b.pdf#unique-01"
   format="pdf"
   scope="external"
  />
</map>

That's as good as any other way to capture the information and I'm happy to
stipulate that this is how it is always captured for the purpose of
processing interchange. This leaves open the possibility of manual or
automatic inclusion of the map into the publication map as I think Michael
is describing in his processing model. How the map is used is an
implementation detail if the map is not literally included by a map author
separate from a specific rendition process action.

When Map A is rendered the questions then are:

Question 1. What should the keydef for key "topic-02" look like in Map A?

My proposal is currently:

<keydef keys="topic-02"
  href="map-b.ditamap#keyname::topic-02"
  format="ditamap"
  scope="peer"
/>

Where the fragment identifier is a strawman for a fragment ID that is
unambiguously a reference to a key in the scope of the key space defined by
root map map-b.ditamap.
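
For illustration, here is a minimal Python sketch of how a processor might
split such an address into its two parts. The "keyname::" syntax is only
the strawman from this message, not an agreed or standardized fragment
identifier scheme, and the function name parse_key_address is hypothetical.

```python
KEY_PREFIX = "keyname::"  # strawman syntax from this message, not a standard


def parse_key_address(href):
    """Split 'map-uri#keyname::key' into (map URI, key name).

    Raises ValueError if the href does not use the strawman syntax.
    """
    map_uri, sep, fragment = href.partition("#")
    key = fragment[len(KEY_PREFIX):]
    if not sep or not fragment.startswith(KEY_PREFIX) or not key:
        raise ValueError("not a key-in-map address: %r" % href)
    return map_uri, key


print(parse_key_address("map-b.ditamap#keyname::topic-02"))
```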

Michael's example is:

<mapref processing-role="resource-only" href="map-b.ditamap"/>

If I understand Michael's approach, he is simply including Map B as a
resource-only map so that the keys have a binding. However, his form of
inclusion doesn't make it clear that the intent is that those keys are
treated as a separate key space, which I think is essential and which is
the intent of my using scope="peer". His approach doesn't keep the two key
spaces separate and therefore requires that the key names not conflict
between the two root maps.

His approach does allow swapping in of the rendition-specific bindings for
Map B given the map-as-authored-to-rendition-map mapping stipulated above as
a necessary parameter to the rendition process. But it still requires a
single unified key space across maps A and B.

In the context of processing Map A as authored outside the context of a
specific rendition there would be nothing to indicate that map B's keys are
not defining resources directly required by Map A. For example, a process
that takes a map and produces a package of all of Map A's dependencies would
also gather up everything used by Map B even though they're not really
direct dependencies of Map A. (Such a processor is part of the open-source
DITA for Publishers project and is also in the Open Toolkit.)

If the mapref specified scope="peer" that would avoid the dependency
confusion but wouldn't avoid the key space combination because there's no
separate direct binding of key in Map A to key in Map B as in my approach.

In both cases we're pointing to the map defining the keys; the difference in
my approach is that I'm also pointing to the key within the map and using
@scope to make it clear that I'm not simply using Map B's key definitions to
include resources as part of Map A's content, which is otherwise the
implication per the DITA 1.2 rules.

In my proposal, because there's an additional layer of indirection between
the key as referenced in the context of Map A and the key as referenced in
the key definition in Map A, the key names need not be coordinated between
the two maps. That is, if Map B defined the key for Topic 2 as
"second-topic", my form of keydef could be:

<keydef keys="topic-02"
  href="map-b.ditamap#keyname::second-topic"
  format="ditamap"
  scope="peer"
/>

And the original reference from Topic 1 would continue to work in both Map A
and Map AB.
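
The extra layer of indirection can be sketched in Python as follows. This
is a simplified model under stated assumptions: key spaces are plain
dictionaries keyed by root map name, the "#keyname::" syntax is the
strawman above, and the resolve function is hypothetical. It does not guard
against circular peer references.

```python
# Hypothetical, already-constructed key spaces, one per root map.
KEY_SPACES = {
    "map-a.ditamap": {
        "topic-01": {"href": "topics/topic-01.dita", "scope": "local"},
        # Map A's own key name; it points at a differently named key in B.
        "topic-02": {"href": "map-b.ditamap#keyname::second-topic",
                     "scope": "peer"},
    },
    "map-b.ditamap": {
        "second-topic": {"href": "topics/topic-02.dita", "scope": "local"},
    },
}


def resolve(root_map, key):
    """Resolve `key` in the key space of `root_map`.

    A peer binding that uses the strawman '#keyname::' address is followed
    into the key space of the map it names. Returns (map, href).
    """
    binding = KEY_SPACES[root_map][key]
    href = binding["href"]
    if binding["scope"] == "peer" and "#keyname::" in href:
        target_map, _, target_key = href.partition("#keyname::")
        return resolve(target_map, target_key)
    return root_map, href


# Topic 1's keyref="topic-02" works even though Map B calls it "second-topic".
print(resolve("map-a.ditamap", "topic-02"))
```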

I think that even if we don't address the keys via fragment ID, we have
to distinguish references to peer and external key sets.

Question 2. How does the agent (person or processor) rendering Map A specify
which rendition of Map B some or all of the links to Map B should point to?

That is, given that there is both a PDF rendition and an HTML rendition of
Map B, the choices are:

- The PDF rendition

- The HTML rendition

- Both renditions (multiple links generated from a single source link, or
some intermediate fan-out link or whatever).

Does this decision need to be made on a per-link basis or on a per-rendition
of Map A basis?

My thinking to date had been that like would always link to like, but
Michael is correct to say that that can't be the only option, so it has to
be either a build-time decision or an authoring-time decision.

I think it needs to be a build-time decision determined by how you define
the mapping of map-as-authored to rendition-specific map. Anything else
would require additional per-key-definition syntax or metadata conventions
that I think would be impractical in practice. I suppose if it came to it,
you could modify the rendition-specific map to reflect exactly what you
wanted and maintain that manually.

Another fact, which I never stated but that Michael correctly pointed out,
is that a given rendition is not identified just by the rendition type (PDF,
HTML, etc.) but by all the runtime parameters that define it, including the
active DITAVAL conditions, any processor-specific runtime options, the
rendition-specific key-to-address mappings, etc.

In my model, there is a Processing Manager that manages all rendition
processing applied to a set of known content, e.g., all the publications
managed within a given system. The Processing Manager abstracts the notion
of "rendition" through a Rendition Definition object, which captures all the
input parameters for a given rendition, e.g., "PDF, DITAVAL
platform='windows', PDF option set 'foo'".

The implication of Rendition Definition is that the same input rendered
using the same Rendition Definition will produce the same (or functionally
equivalent) output.

Michael's model assumes there is no Processing Manager but that processing
happens where it happens and people coordinate however they do it. However,
the abstract notion of Rendition Definition is the same: you have to know
what all the parameters were in order to reproduce the rendition. So in
Michael's distributed world the Rendition Definition might be implemented as
notes scribbled on your desk blotter or an email from the supplier of the
rendition you want to link to, or whatever, but the information content is
the same regardless.

We can now define a "rendition instance" as being a Rendition
Definition/input map pair. Two different input maps that use the same
Rendition Definition will have "consistent" or "compatible" output (that is,
they'll reflect the same set of runtime options).
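
As a minimal sketch of these two abstractions, assuming Python dataclasses
as the representation: the field names below (output_type, ditaval,
options, key_binding_maps) are hypothetical and only stand in for "all the
input parameters for a given rendition".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenditionDefinition:
    """All input parameters for a rendition (hypothetical field names)."""
    output_type: str
    ditaval: tuple = ()           # active conditions, e.g. (("platform", "windows"),)
    options: tuple = ()           # processor-specific runtime options
    key_binding_maps: tuple = ()  # (authored map, rendition-specific map) pairs


@dataclass(frozen=True)
class RenditionInstance:
    """A Rendition Definition/input map pair."""
    definition: RenditionDefinition
    input_map: str


def compatible(x, y):
    """Two instances are 'compatible' if they share a Rendition Definition."""
    return x.definition == y.definition


pdf_windows = RenditionDefinition(
    "pdf",
    ditaval=(("platform", "windows"),),
    options=(("pdf-option-set", "foo"),),
)
a_pdf = RenditionInstance(pdf_windows, "map-a.ditamap")
b_pdf = RenditionInstance(pdf_windows, "map-b.ditamap")
print(compatible(a_pdf, b_pdf))
```

Whether this record lives in a management system or on a desk blotter, the
fields it has to carry are the same.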

This is all, I think, equivalent to Michael saying "the person who renders
the map has to specify the appropriate DITAVAL files, rendition-specific key
bindings, and so on".

So my notion of Rendition Definition is either literal, as in my Processing
Manager system, or virtual, reflected in the knowledge of the person doing
the rendering, but in both cases, the same information is represented.

In my model all rendering is done by the same processors, so that
coordination of intermediate data (key-to-rendition-location mappings) is
obviously easy to do.

But Michael says "no, you can't assume that--it has to be more disconnected
and distributed", which is true.

But I think the degree of distribution becomes an implementation detail.
That is, if Rendition Definitions include the key-to-rendition bindings,
it's only a question of how those bindings get communicated among processing
systems, not how they are captured or represented. Michael presumes or
stipulates a map-based syntax because that is reliably interchanged and
processed by DITA processors, which is fine.

So now to the processing:

If we stipulate that rendition-target-type is a runtime parameter, then when
I process Map A to a particular rendition and want links to be to the PDF
renditions of the target publications, part of the Rendition Definition is
"render cross-publication links to PDF renditions". But in fact, it needs to
be "Render cross-publication links to the rendition created using Rendition
Definition X", that is, a specific Rendition Definition reflecting a
specific set of rendition options, not just the base output type.

In the context of the Open Toolkit, this means all the Ant parameters plus
all the Toolkit Plugins and environment variables that contribute to the
configuration of the transformation type used. Any other processing system
will have the equivalent set of options and starting conditions.

Given this background, we can now explore the different processing use
cases:

Processing Use Case 1: Don't have rendition-specific key bindings for Map B.

If I process Map A and I don't have the rendition-specific key binding for
Map B, the processor has three choices:

1. Process Map B using the Rendition Definition specified and then use the
result to complete processing Map A. Note that this could be a literal
process or it could be "get on the phone to the supplier of Map B and ask
for the rendition-specific key binding that reflects the Rendition
Definition you want".

2. Process Map A with placeholder or otherwise unresolvable links.

3. Fail the rendition of A.
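
The three choices can be sketched as a policy switch in Python. This is an
illustration only: the enum, the function, and the render_target callback
(which stands in for either a literal process or a phone call to the
supplier of Map B) are all hypothetical names.

```python
from enum import Enum


class MissingBindingPolicy(Enum):
    RENDER_TARGET_FIRST = 1  # choice 1: obtain the bindings, then continue
    PLACEHOLDER = 2          # choice 2: emit an unresolved link
    FAIL = 3                 # choice 3: fail the rendition


def resolve_cross_publication_key(key, bindings, policy, render_target=None):
    """Return the rendition address for `key`, or None for a placeholder.

    `bindings` is the rendition-specific key binding for the target map
    ({key: address}), or None if it is not available.
    """
    if bindings is None:
        if policy is MissingBindingPolicy.RENDER_TARGET_FIRST:
            if render_target is None:
                raise ValueError("no way to obtain the target's bindings")
            bindings = render_target()
        elif policy is MissingBindingPolicy.PLACEHOLDER:
            return None  # caller renders a placeholder link
        else:
            raise LookupError("no rendition-specific key bindings available")
    return bindings.get(key)


print(resolve_cross_publication_key(
    "topic-02", None, MissingBindingPolicy.PLACEHOLDER))
```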

Processing Use Case 2: Do have rendition-specific key bindings for Map B.

If we are using my cross-publication key definition approach, then there
needs to be an association between the root map map-b.ditamap and the
corresponding set of rendition-specific keys. Abstractly this is part of the
Rendition Definition parameters: you simply say "for map file
'map-b.ditamap' use key definitions 'map-b-PDF.ditamap'" or whatever. It
could also be done by literally changing "map-b.ditamap" to
"map-b-PDF.ditamap" in the map source before processing it normally. In any
case, given that association, the processor can resolve references to keys
nominally defined in map-b.ditamap to the keys as bound in the
rendition-specific binding.

Using Michael's approach it's essentially the same: you define the mapping
as a rendition parameter or otherwise modify the map to be processed to
replace "map-b.ditamap" with "map-b-PDF.ditamap".

In the context of the Open Toolkit this would be something done as part of
the general preparation of the intermediate files used to then create the
final rendition (e.g., as part of the map-pull process or whatever makes
sense). The data manipulation required is consistent with the sort of
manipulation the Toolkit already does.

In the context of Processing Manager the mapping might be hidden behind a
key resolution API that takes the rendition-specific key definitions into
account.
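
A key resolution API of that kind might look roughly like the following
Python sketch. The KeyResolver class and its method are hypothetical; the
PDF address is the one from the Map B-PDF example above, and key spaces are
again modeled as plain dictionaries.

```python
class KeyResolver:
    """Hypothetical key resolution API that hides the map substitution."""

    def __init__(self, key_spaces, rendition_maps):
        # key_spaces: {map name: {key: address}}
        # rendition_maps: {authored map: rendition-specific map}
        self._key_spaces = key_spaces
        self._rendition_maps = rendition_maps

    def resolve(self, root_map, key):
        """Resolve `key` in `root_map`, preferring rendition bindings."""
        effective_map = self._rendition_maps.get(root_map, root_map)
        return self._key_spaces[effective_map][key]


resolver = KeyResolver(
    key_spaces={
        "map-b.ditamap": {"topic-02": "topics/topic-02.dita"},
        "map-b-PDF.ditamap": {
            "topic-02": "/workspace/output/map-b/pdf/map-b.pdf#unique-01",
        },
    },
    rendition_maps={"map-b.ditamap": "map-b-PDF.ditamap"},
)
print(resolver.resolve("map-b.ditamap", "topic-02"))
```

The caller keeps asking for keys in "map-b.ditamap" as authored; the
association with the rendition-specific bindings is purely a parameter.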

In any case, the result is the same--the rendered links reflect the bindings
defined in the rendition-specific key map.

The only difference is the interaction or potential interference of the two
key spaces.

In my approach, as explained above, there's no possible interference of the
two key spaces because they are kept distinct, while in Michael's approach
the key spaces are combined.

I'm sure there's more to say on this subject but I'm out of time for now.
But I think I've made my point about as clearly as I can.

Cheers,

E.

--
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 512.554.9368
www.reallysi.com
www.rsuitecms.com
References:
- Re: [dita] Scenario for cross-deliverable referencing
  - From: Michael Priestley <mpriestl@ca.ibm.com>