Subject: Re: [dita] Cross-Deliverable Links and Key Resolution

I strongly object to the idea of a blanket rule stating that peer-map key
scopes are always lower priority than local key scopes. There are valid
use cases for having the peer map be both a higher and a lower priority,
and so we should allow map authors to express - via relative positioning
of peer maps and local keydefs - the precedence they want.

A map author might want to define a local override for a key defined in a
peer map that always applies, regardless of the availability of the peer
map.

Another map author might want to provide local definitions for keys from a
peer map, but place them such that they *only* apply if the peer map is
unavailable.

We should allow for both.
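
To make the two cases concrete, here is a rough Python sketch of
position-based precedence. It is only an illustration, not proposed spec
language: the tuple-based map model, the resolve() function, and the peer
target "otherPub/about.html" are all made up for the example, and "first
definition in map order wins" is a deliberate simplification of the real
key definition precedence rules.

```python
# Hypothetical model: a map is an ordered list of entries.
#   ("keydef", qualified_key, target)  is a local key definition
#   ("peer", scope_name)               is a peer-scoped mapref
# peer_spaces maps a scope name to that peer map's keys; a scope that is
# absent from peer_spaces stands for an unavailable peer map.

def resolve(entries, key, peer_spaces):
    """Return the effective target for `key`, or None if undefined.

    Simplifying assumption: the first definition in map order wins.
    """
    for entry in entries:
        if entry[0] == "keydef":
            _, name, target = entry
            if name == key:
                return target
        elif entry[0] == "peer":
            scope = entry[1]
            peer_keys = peer_spaces.get(scope)
            prefix = scope + "."
            if peer_keys is not None and key.startswith(prefix):
                unqualified = key[len(prefix):]
                if unqualified in peer_keys:
                    return peer_keys[unqualified]
    return None


# Case 1: local override placed before the peer map, so it always applies.
override_first = [
    ("keydef", "otherPub.about", "aboutThatOtherPub.dita"),
    ("peer", "otherPub"),
]

# Case 2: local definition placed after the peer map, so it applies only
# when the peer map does not supply the key.
fallback_last = [
    ("peer", "otherPub"),
    ("keydef", "otherPub.about", "aboutThatOtherPub.dita"),
]

available = {"otherPub": {"about": "otherPub/about.html"}}
unavailable = {}

for label, entries in (("override first", override_first),
                       ("fallback last", fallback_last)):
    for state, spaces in (("peer available", available),
                          ("peer unavailable", unavailable)):
        print(label, "/", state, "->",
              resolve(entries, "otherPub.about", spaces))
```

With the override first, the local target wins in both states. With the
local definition last, the peer target wins when the peer map is available
and the local target is used only when it is not. A blanket "peer is always
lowest" rule would make the second arrangement impossible to express.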


On 12/9/14, 10:49 AM, "Eliot Kimber" <ekimber@contrext.com> wrote:

>You are correct that this discussion is only with regard to construction
>of key spaces in the linking map: no other processing of the peer map is
>expected or intended.
>You make an interesting point about a peer map that itself has a reference
>to another peer map: what are the implications of that?
>I think the implication is that a key defined in a peer map could itself
>be bound to a key in another peer. It also means that you can't know the
>complete root key space for a given peer map without also processing its
>peers *in the case where a peer map could override locally-defined keys*.
>In the case where peer maps *cannot* override locally-defined keys or
>*always* override locally-defined keys, then the peer map is not relevant.
>I think that's why I didn't consider this case: I had either assumed peer
>maps implicitly define all keys in their scope or that peer maps would
>always have a lower key definition precedence.
>So maybe the answer is just that: peer maps always have the lowest key
>definition precedence with respect to all local maps. That avoids the
>issue of knowing if locally-defined keys with the same scope qualification
>as peer map are or are not effective: they would always be effective.
>That leaves only knowing if a given peer-scoped key reference is or is not
>resolvable, which is, I think, a less serious issue, because processors
>can be allowed to choose how they handle the case: report the key as
>undefined and treat it as such or report the key as currently unresolvable
>but not undefined (using some placeholder link text, for example), that
>is, something consistent with Chris' original proposal for how to handle
>the undefined peer key case.
>In addition, I think my position of yesterday where I said that the peer
>map *must* be available was ultimately misguided. Thinking it through
>again, I realized that in fact you can never be sure if a given peer
>reference will actually work, even after you've produced the final
>deliverable using the key-to-anchor map you have, so ultimately it doesn't
>matter if you do or don't have the peer map available: the correctness of
>any given link will depend entirely on the target deliverable that's
>actually available: any information you have before final deliverable is
>potentially incorrect and there's no way around that in the general case
>(see below for a deeper exploration of that aspect of cross-deliverable
>linking).
>[More thought background:]
>The problem I was concerned with when thinking about this yesterday was
>that you could get different results when processing with and without
>knowledge of the peer map's key space: without the key space available,
>you might determine that a lower-precedence local key is effective but
>later, when you process with knowledge of the peer key space (or, more
>realistically, with the deliverable-specific key definitions for the peer
>deliverable), the peer key becomes effective. And the reverse is possible
>as well: if we used the "peer scope matches all possible keys" rule, then
>without the peer map available, you would treat the lower-precedence key
>as not effective, but later, if the key is not actually defined in the
>peer map (that is, not reflected in the deliverable-specific keys for the
>peer deliverable), it becomes available.
>But in the face of a cascade of peer references, I think this potential
>variability is unavoidable because otherwise the processing requirements
>would be too high.
>Also, it's possible to avoid the problem by making all your peer map
>references have lower precedence than any other key definition (which
>might require some gymnastics with sequences of nested maps that
>ultimately include the peer map references, but certainly doable).
>That suggests that we need something closer to Chris' original proposal,
>which was less absolute than my suggestion from yesterday.
>Note that we're only concerned with results of non-final processing
>(meaning processing done without complete knowledge of the
>deliverable-specific key-to-anchor mapping).
>For final deliverable-specific processing, where you have the
>key-to-anchor mapping, the key processing will be consistent and
>repeatable: any two processors should give exactly the same key resolution
>result (meaning there is no ambiguity about which key definitions are
>effective).
>So I think the question comes down to what processors are allowed to do or
>required to do for non-final processing.
>I think the options are:
>1. Require processors to treat non-available peer maps as completely
>missing for purposes of key definition precedence determination (that is,
>is a local key with the same scope qualification effective or not
>effective or is a peer-scope-qualified key defined or not defined). Allow
>(but not require) processors to construct peer key spaces when the peer
>map is available. In this case, allow processors to ignore any peer map
>references within the directly-addressed peer map (that is, let them avoid
>infinite regress).
>2. Allow (but not require) processors to treat peer scopes as matching
>all-possible keys in the scope. Also allow processors to construct peer
>key spaces as for option (1).
>Both options may produce different results for non-final processing and
>final processing but option (2) imposes less processing overhead (because
>you never have to worry about constructing the peer key space).
>At the end of the day cross-deliverable addressing is always potentially
>incomplete because the two deliverables may be produced asynchronously and
>therefore might be out of sync: even if you have what you think is a
>correct set of key-to-anchor mappings for a given target deliverable, it
>might not reflect the actual deliverable. There's simply no way around
>that in the general case.
>Thus, anything we put in the spec to try to ensure correctness of
>cross-deliverable links is ultimately doomed to fail.
>So I think whatever rules we define have to both recognize this inherent
>unreliability and give processors the flexibility to be as rigorous or
>light-weight as they choose as long as the implications for a given
>processing choice are clear.
>[End thought background]
>>I am still a little uncomfortable with some of the statements expressed
>>below, but I came up with a couple statements that may / may not be
>>correct to at least help me get closer to comfort.
>>Peer maps with no keyscope (all maps defined with DITA 1.2 and prior). In
>>this case "any" key not defined somewhere else "may" be defined in the
>>peer map. So any message I think would be the same as before DITA 1.3, no
>>need to say it "could" be in a peer map. (One thing is that if there is
>>fallback markup in a key reference, then no message would be output in
>>any case). I don't know if it would be good to note in a composition that
>>"peer" maps were found in this case. If we were to keep to our definition
>>from DITA 1.2 then we would not report this,  from the DITA 1.2
>>definition of peer in scope.
>>Peer maps with keyscopes (DITA 1.3) may need to have new behavior in that
>>they need to be accessible (in some way, either in fact, or by proxy) for
>>key space processing. Since this is "new" to DITA 1.3 I think that might
>>be acceptable as a "should". If there is a keyscope then at least we
>>would be able to know whether a message should be issued when a keyref
>>could not be resolved (because it would have a scope identifier).
>>However, if a
>>key is resolved by some other reference found in a map or submap with a
>>scope-identifier but "might" have been defined in a peer, I don't think a
>>message should be output. I see a lot of noise that could be generated.
>>Peer maps that have submaps that are local/peer also need to be
>>processed, so this rule keeps going right? Peer maps may now have their
>>own whole processing tree, with keys, more keyscopes, etc. Again, only if
>>a keyscope is set on a peer map would we want to provide messages, etc.
>>about any potential keyref not found. One thing that could happen is that
>>the "peer" map could define a keyscope for the local map we are
>>processing since they are peers of each other. That peer map would want
>>to have references to our "local" map, probably with some sort of
>>keyscope. I see possible circular references happening with this, but
>>perhaps can be sorted out.
>>And, we are defining this for peer map processing of key spaces only,
>>correct? Peer still means we don't normally process the peer map during
>>normal processing of topics, correct? Or is that "scope" also expanded?
>>It seems that if peer maps include key references to items in the peer
>>map, then a new category of messages might now be "you do not have a
>>topic defined in local map, but in a peer map". That may be OK also, but
>>there are certainly some processing implications when you open this up.
>>These may be non-issues as this is still swirling in my mind, but thought
>>I should put them down.
>>In thinking this through I've come to the conclusion that Chris' position
>>is the correct one: there needs to always be a peer map that defines the
>>keys defined in that map (or more abstractly, access to the peer map's
>>root key space). If this is the case then normal key processing rules
>>apply at all times and there's no need to have any special rules about
>>key name matching.
>>I think the best way to express this in the specification is in terms of
>>key space construction:
>>If the peer map's root key space is not available:
>><ul>
>><li>The processor MUST behave as though there was no peer map reference
>>for the purpose of constructing the local key spaces. Processors SHOULD
>>report locally-defined keys that could be overridden by keys defined in
>>the peer map.</li>
>><li>References to keys that would only be defined in the peer key space
>>MUST be treated as undefined.</li>
>></ul>
>><note>The practical implication of this rule is that the owners of peer
>>maps should provide at least a set of key definitions for those keys
>>available in the peer map even if the peer map itself is not or cannot
>>be made available (for example, for security reasons). If the owner of
>>the peer map does not make the full map or key definitions available,
>>the author of the linking map should create a set of key definitions to
>>serve as a "proxy" for the peer map.</note>
>><note>When peer maps are referenced from the local map such that keys
>>defined in the peer map could override keys with the same scope
>>qualification defined in the local map, processors must have access to
>>the peer map's root key space in order to accurately construct the local
>>key spaces.</note>
>>A locally-defined key can be overridden by a peer map if the
>>locally-defined key is defined in a submap with a lower key definition
>>precedence than the peer map. For example, in this map:
>>  <mapref scope="peer"
>>    keyscope="book-02"
>>    href="../book-02/book-02.ditamap"
>>  />
>>  <mapref scope="local"
>>    keyscope="book-02"
>>    href="book-02-overrides.ditamap"
>>  />
>>  ...
>>The peer map <filepath>book-02.ditamap</filepath> is included before the
>>local submap <filepath>book-02-overrides.ditamap</filepath>. This gives
>>it higher import precedence, meaning that any keys it defines would
>>override any keys defined in the
>><filepath>book-02-overrides.ditamap</filepath> map (which has the same
>>scope name as the peer map). But a processor cannot determine if any keys
>>defined in the <filepath>book-02-overrides.ditamap</filepath> map are
>>overridden unless it knows what the key space for the peer map is.
>><p>If the order of these two maps is reversed:
>><mapref scope="local"
>>  keyscope="book-02"
>>  href="book-02-overrides.ditamap"
>>  />
>><mapref scope="peer"
>>  keyscope="book-02"
>>  href="../book-02/book-02.ditamap"
>>  />
>>Then the local map <filepath>book-02-overrides.ditamap</filepath> has a
>>higher key definition precedence and therefore any keys defined in that
>>map would override the same keys defined in the peer map. In that case
>>the peer map key space is not needed in order to determine the effective
>>key definitions in the local map's key spaces.
>>>That's not quite what I'm saying, and I feel pretty strongly that we
>>>shouldn't special-case peer map scope precedence. What I'm trying to get
>>>at is this:
>>><mapref keyscope="otherPub" scope="peer" href="pub.ditamap"/>
>>><xref keyref="otherPub.someKey"/>
>>>Here is how I would want my processor to behave:
>>>* If pub.ditamap *can* be loaded, and the map does not contain a
>>>definition for 'someKey', processors should issue a WARNING or an ERROR
>>>stating that no definition for the referenced key could be found.
>>>* If pub.ditamap can be loaded and it contains a definition for someKey,
>>>huzzah, move on.
>>>* If pub.ditamap cannot be loaded, and no other definition for
>>>otherPub.someKey is found, processors should issue a MESSAGE or *maybe*
>>>WARNING (*NOT* an ERROR) stating that the key could not be resolved, but
>>>may be defined in the peer map. (This, I think, is where we differ.)
>>>* If pub.ditamap cannot be loaded but the local map contains a definition
>>>for otherPub.someKey, huzzah, move on.
>>>On 12/4/14, 4:10 PM, "Eliot Kimber" <ekimber@contrext.com> wrote:
>>>>I think I see what you're getting at.
>>>>Applying the normal key definition precedence rules, if the map is as you
>>>>show it:
>>>><mapref scope="peer" keyscope="otherPub" href="otherPub.ditamap"/>
>>>><keydef keys="otherPub.about" href="aboutThatOtherPub.dita"/>
>>>>Then if the peer map also defines the key "about" that will override the
>>>>local keydef (by normal key definition precedence) but if it doesn't
>>>>define it, then the local one would be effective.
>>>>On the other hand, if the map was:
>>>><keydef keys="otherPub.about" href="aboutThatOtherPub.dita"/>
>>>><mapref scope="peer" keyscope="otherPub" href="otherPub.ditamap"/>
>>>>Then the first keydef clearly takes precedence over the peer scope,
>>>>by normal key precedence rules.
>>>>That suggests that the rule has to be that peer scopes take precedence
>>>>over any later-defined key with the same scope qualifier. One way to say
>>>>this is that a peer scope implicitly defines *all possible* keys in that
>>>>scope. That resolves the ambiguity in a clear and simple way. We can also
>>>>say explicitly that processors may issue a warning in this case, because
>>>>it could be that the map author really did want to override that one key
>>>>and so they need to move it before the peer scope. That also suggests that,
>>>>as a matter of best practice, peer scope maprefs should go last in the map.
>>>>The only other option I can think of is actually requiring that the peer
>>>>maps be present at processing time and that was never our intent and would
>>>>only be necessary to handle this case. It would set up a "Simon says"
>>>>requirement where you might be forced to create maps that declare the
>>>>peer keys you decide to reference just so that the processor won't
>>>>complain, where you would otherwise not ever need the actual peer maps
>>>>because you're otherwise just using the key-to-address maps provided by
>>>>the target publications.
>>>>>There is no way to tell whether a key reference is to a peer map by
>>>>>looking at it. You can tell if it specifies a scope qualifier that is
>>>>>mapped to a peer map, but that's not a 100% guarantee that the key will
>>>>>appear in that peer map.
>>>>><mapref scope="peer" keyscope="otherPub" href="otherPub.ditamap"/>
>>>>><keydef keys="otherPub.about" href="aboutThatOtherPub.dita"/>
>>>>>If, in a map containing the above, you see keyref="otherPub.intro", and
>>>>>otherPub.ditamap isn't available, there is *no way* for any processor to
>>>>>tell with 100% certainty whether it's a reference to a key that should be
>>>>>defined in the other publication, and so shouldn't be reported on, or a
>>>>>reference to a missing key definition in the 'local' keyscope
>>>>>and so should be called out.
>>>>>You can only know if a given key is or is not resolvable when you produce
>>>>>the final deliverable of the document doing the linking: between the time
>>>>>you author the link and when you produce the final form, anything can
>>>>>happen to the target document.
>>>>>For all these reasons, peer key references simply have to be ignored for
>>>>>the purpose of determining whether or not a key is resolvable as long as
>>>>>you're not producing the final-form deliverable.
>>>>>At which point it's too late. There simply must be a way to validate
>>>>>keyrefs prior to final publication. If a keyref is not resolvable within
>>>>>the local key scope structure, but looks like it could refer to a key in
>>>>>an unavailable peer map, processors should say something. I don't think
>>>>>they should necessarily say the same thing they say for unambiguously
>>>>>unresolvable keyrefs (which is, I think, what the current language
>>>>>mandates), but they should say *something*.
>>>>>On 12/4/14, 1:45 PM, "Eliot Kimber" <ekimber@contrext.com> wrote:
>>>>>>I think there may be some confusion about how I intended
>>>>>>references to be processed. This was captured in the discussion that
>>>>>>Michael and I had about how to implement cross-deliverable link
>>>>>>but since that concept didn't get included in the 1.3 spec I think
>>>>>>people have not paid attention to it.
>>>>>>I think the relevant aspect of cross-deliverable linking for this
>>>>>>discussion is that the facility as specified explicitly does not require
>>>>>>that you know for sure that a given peer key will actually be defined *at
>>>>>>the time you author the link*. The reason for this is that the
>>>>>>publications involved may be developed and produced asynchronously and
>>>>>>with little coordination. Thus the keys you want to link to may not in
>>>>>>fact have been literally defined at the time you author the links.
>>>>>>You can only know if a given key is or is not resolvable when you produce
>>>>>>the final deliverable of the document doing the linking: between the time
>>>>>>you author the link and when you produce the final form, anything can
>>>>>>happen to the target document.
>>>>>>In addition, if you use the generic key-based implementation approach
>>>>>>Michael and I developed, all references to peer keys become local key
>>>>>>references when you produce the final deliverable so normal key
>>>>>>rules apply during that final deliverable production process.
>>>>>>For all these reasons, peer key references simply have to be ignored for
>>>>>>the purpose of determining whether or not a key is resolvable as long as
>>>>>>you're not producing the final-form deliverable.
>>>>>>The reason that there is this distinction between production of the
>>>>>>final-form deliverable and any other processing you might be doing is
>>>>>>because resolving cross-deliverable links requires a multi-pass process,
>>>>>>conceptually and that's how a lot of processors will implement it. In
>>>>>>particular, it is possible to have any amount of time elapse between when
>>>>>>you do pass 1, as described below, and when you do pass 2: there is no
>>>>>>requirement that they be performed together in time. Therefore I
>>>>>>can reasonably expect that most processors will actually reflect
>>>>>>pass in their implementations.
>>>>>>The passes are:
>>>>>>Pass 1: Each publication involved in cross-deliverable linking is
>>>>>>processed once to determine, *for that publication*, what deliverable
>>>>>>anchors any keys become for that deliverable. This mapping of
>>>>>>keys-to-deliverable-addresses is saved for use in subsequent passes (it
>>>>>>was the details of how this data could be saved that Michael and I
>>>>>>discussed and arrived at the proposed interchange solution of using
>>>>>>intermediate key definitions).
>>>>>>For example, if topic "topic-01.dita" is referenced by the topicref:
>>>>>> <topicref keys="chapter-01" href="topic-01.dita">
>>>>>>in the map and if for HTML output the result is HTML file
>>>>>>"chapter-01.html", then the deliverable-specific key-to-anchor mapping
>>>>>>would be "key 'chapter-01' maps to HTML file 'chapter-01.html'" for that
>>>>>>deliverable. This mapping can be represented by a normal key definition of
>>>>>>the form:
>>>>>><keydef keys="chapter-01"
>>>>>>  href="../../publication-02/chapter-01.html"
>>>>>>  scope="external"
>>>>>>  format="html"
>>>>>>  />
>>>>>>Pass 2: Each publication involved in cross-deliverable linking is
>>>>>>processed again, this time using the deliverable-specific key-to-anchor
>>>>>>mappings for each of the target publications to resolve any key references
>>>>>>to those publications.
>>>>>>Note that pass 1 does not *require* that any target peer maps be available
>>>>>>because you're only concerned with keys within each publication (that is,
>>>>>>generating that publication's key-to-anchor map).
>>>>>>It is not until pass 2 that the processor has to be able to resolve
>>>>>>cross-deliverable keys and that is the point at which failures can and
>>>>>>should be reported.
>>>>>>Note also that there is an inherently loose coupling between these
>>>>>>phases: in the general case you don't know when or if any given
>>>>>>deliverable will itself be available and therefore you don't
>>>>>>know during pass 1 processing if a given key will or won't actually be
>>>>>>resolvable when you go to do pass 2. You might have authored links to a
>>>>>>key that you expect will be defined but doesn't happen to be defined in
>>>>>>the target publication at authoring time. As long as that key is defined
>>>>>>and resolvable when you do pass 2, it's all good.
>>>>>>Thus, there can be processing contexts in which it is not known, and
>>>>>>doesn't need to be known, that a peer key reference can't be resolved,
>>>>>>namely the pass-1 processing for each publication.
>>>>>>However, *if the peer maps are available*, processors certainly can check
>>>>>>the key definitions if they choose to and report the issue. But how you
>>>>>>manage your related publications relative to each other and the
>>>>>>of deliverables is entirely a business decision: you could impose
>>>>>>tight controls or very loose controls, depending on what you need.
>>>>>>The DITA-defined aspects of the process accommodate both loose and
>>>>>>For that reason, we cannot state the rule for peer keys as "if you can't
>>>>>>resolve the key it is treated as an unresolvable key" because there are
>>>>>>now valid processing contexts where you simply don't know if the key is or
>>>>>>is not resolvable.
>>>>>>I think the rule has to be stated in terms of producing final
>>>>>>deliverables: at that point, the normal unresolvable key rules should
>>>>>>apply.
>>>>>>But, there's more:
>>>>>>The general mechanism Michael and I arrived at uses intermediate key
>>>>>>definitions as the way of capturing the key-to-anchor binding, as
>>>>>>The basic idea is that in pass 1 you generate a set of key definitions
>>>>>>that reflect the key-to-anchor binding for the deliverable you're
>>>>>>creating. These keys are declared as scope="external" and with a format
>>>>>>reflecting the target deliverable (e.g., format="html", format="pdf" or
>>>>>>whatever it is).
>>>>>>In pass 2, each publication that links to that deliverable literally
>>>>>>includes those keys before any locally-defined keys so that the
>>>>>>deliverable-specific keys take precedence.
>>>>>>In this scenario, during pass 2 processing the key definitions are
>>>>>>local to the publication making the cross-deliverable link, not peer,
>>>>>>so normal key processing rules apply: either the key is defined and it's
>>>>>>all good or it's not and normal undefined key rules apply.
>>>>>>Given this implementation approach, it should be clear that processors
>>>>>>should ignore peer key references, at least for the purposes of
>>>>>>unresolvable key rules, because they can't know for sure if the key is or
>>>>>>is not resolvable in the general case.
>>>>>>However, DITA users can choose to impose a rule that all peer maps be
>>>>>>available during pass 1 processing and that they should reflect the
>>>>>>set of keys that will be available in that publication. This is the
>>>>>>"tightly controlled interlinked publication set" use case, e.g., what
>>>>>>might be provided by a CCMS that manages the authoring and
>>>>>>all the publications in a related set, enforcing specific business rules
>>>>>>for release and publication. (This was the use case I typically had in
>>>>>>mind when thinking about this problem, e.g., the "all-knowing
>>>>>>process manager".)
>>>>>>In that case processors can check the resolvability of peer key
>>>>>>early and report them or treat them as unresolvable during pass 1 (or
>>>>>>some appropriate workflow checkpoint where it is required that all
>>>>>>be resolvable). But that is an implementation and business rule
>>>>>>that is not inherent in the cross-deliverable link mechanism and that
>>>>>>cannot be mandated by the standard.
>>>>>>Note that Michael had a completely different and equally-valid use case in
>>>>>>mind: the "disconnected and lightly coordinated interlinked document set",
>>>>>>where publications that link to each other are managed by different
>>>>>>with very little direct coordination other than the interchange of
>>>>>>key-to-anchor maps necessary to produce publications that link to
>>>>>>each other.
>>>>>>In the context of Michael's use case, it should be clear that trying to
>>>>>>enforce key resolvability during pass 1 is simply not generally
>>>>>>in some cases, not possible, because you simply don't have the
>>>>>>key-to-anchor mapping during initial authoring or maybe not until you
>>>>>>final deliverable generation for publication.
>>>>>>In this disconnected case you might expect owners of documents to
>>>>>>interchange maps that provide just the key definitions to which other
>>>>>>publications are allowed to link. In that case, early validation of key
>>>>>>references would be possible. But again, this level of coordination is not
>>>>>>required by the facility as specified or intended.
