Subject: RE: [dita] Unique topic ids in the cross publication or global CMS use case
Thanks, Eliot. That is pretty clear.

From your PowerPoint, the text below is crystal clear. In proposal 13041 you
have to work harder to get it (you define location as authored and location as
delivered), but the PowerPoint is clearer.

I think DITA 1.3 needs to express something about "no requirement for topic
IDs to have global uniqueness" because, as you say, there is a misconception
about this and we are not a strictly normative spec.

******************** your text from powerpoint ***************
Addressing within the content as authored:
  Defined by the source format, e.g., DITA XML
  For XML source, should be independent of any given output format
  DITA defines the rules for addressing within DITA XML

Addressing from the publication as delivered:
  Defined by the delivery format: PDF, HTML, EPUB, etc.
  No single standard
  Details may be proprietary
************************************************************

> -----Original Message-----
> From: Eliot Kimber [mailto:ekimber@rsicms.com]
> Sent: June-16-13 6:10 AM
> To: Jim Tivy; dita
> Subject: Re: [dita] Unique topic ids in the cross publication or global CMS use
> case
>
> Jim,
>
> I think your concern is addressed by the current cross-deliverable addressing
> proposal: it does in fact propose the use of keys and mappings from keys to
> locations as delivered as the way to ensure reliable cross-deliverable addressing.
> The proposal as documented should make it clear that processors are obligated
> to manage a mapping from objects as authored to objects as delivered such that
> any delivery constraints are not imposed back onto the authored content, for
> example, making topic IDs unique within a publication.
>
> If the proposal is not sufficiently clear on that point then we must correct it.
> Because I am so deeply into issues of linking and addressing I often forget that
> what to me seems obvious is in fact not at all obvious.
>
> Perhaps it's useful to discuss the general issue of topic IDs and their
> non-requirement for uniqueness in the context of addressing generally. I think
> there is some general misunderstanding in the community on what is and isn't
> required and probably some poor implementation choices made long ago that
> still linger in our community. I don't fault implementors for not always
> understanding the subtleties of addressing--it's a challenging subject.
>
> -----------------------------------
> Topic ID Uniqueness Is Not Required
>
> Topic *IDs* are not required to be unique outside the context of their containing
> XML document, nor do they need to be.
>
> However, topic document addresses *are* necessarily unique, because the XML
> documents that contain topics are distinct storage objects, which means they
> have a unique location within the storage system that contains them and that
> storage system has a unique location within the set of all possible storage
> locations. That's how storage systems work.
>
> In the world of the Web, every storage system exists on some kind of server
> with a unique IP address. The storage system itself then exists at some unique
> location within that server, and the resources managed by the storage system
> then have unique locations, e.g., filenames, object IDs, or what have you.
>
> Thus, every *topic* has a unique URL/ID pair that distinguishes it from *all
> possible other topics* in existence at any moment in time.
>
> Thus the ID of the <topic> element is necessary *only* to distinguish different
> topics within the same *XML document*. But that requirement is imposed by
> XML itself since DITA defines topic IDs as XML IDs.
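
The URL/ID pair argument above can be illustrated with a minimal Python sketch. The `TopicAddress` type and the example.com URLs are hypothetical and exist only for illustration; nothing here is defined by the DITA specification.

```python
from typing import NamedTuple


class TopicAddress(NamedTuple):
    """A topic's full address: the containing document's URL plus the topic's XML ID."""

    document_url: str
    topic_id: str

    def as_uri(self) -> str:
        # URL of the document followed by the topic ID as a fragment identifier.
        return f"{self.document_url}#{self.topic_id}"


# Hypothetical documents that both reuse the same topic ID, "topicid".
install = TopicAddress("http://example.com/docs/install.dita", "topicid")
upgrade = TopicAddress("http://example.com/docs/upgrade.dita", "topicid")

print(install.topic_id == upgrade.topic_id)  # True: the IDs collide
print(install == upgrade)                    # False: the addresses do not
```

The two topics share an ID, yet their full addresses differ because the containing documents live at different locations.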
>
> If an XML document consists of exactly one topic, then addressing the document
> is sufficient to reliably address the topic (by the rules of DITA
> addressing) and in that case the topic ID is only of interest for addressing
> elements within the topic, because DITA fragment identifiers are
> {topicid}/{elementid} pairs. But even there, the value "topicid" for all topic IDs in
> this case is as good as anything.
>
> For the purposes of addressing in deliverables, there is no need for topic IDs to
> be unique because the processor that generates the deliverable can ensure that
> the IDs used in the deliverable are unique within that deliverable. The deliverable
> is itself a storage object (or collection of storage objects) that, like all storage
> objects, has identity within the set of all possible storage objects.
>
> In addition, the processor that produces the deliverable must be able to have the
> information required to maintain the mapping from objects as authored (that is,
> topic IDs, element IDs, and keys) to their locations as delivered. This is true
> because the processor must have both the original source and the deliverable it
> generated available to it--this does not mean that all existing processors were
> implemented in such a way that this information is maintained, only that they all
> *could have been*.
>
> So again, addressability is assured as long as the processor generating the
> output generates unique IDs for any addressable things put into the deliverable
> and maintains the source-to-deliverable address mapping.
>
> If you need to do cross-deliverable addressing then you need to have a mapping
> from the locations (not just IDs) of the things as authored to the locations of the
> things as delivered.
> That mapping could be managed in many ways but the
> current cross-deliverable proposal does it through the use of keys and
> intermediate key definition sets that map the keys as used in the content as
> authored to the locations of the key-bound resources in the deliverable. That is
> sufficient to support the requirement for addressability.
>
> In addition, the @copy-to attribute on <topicref> gives authors additional
> control over deliverable addresses by allowing the assignment of new virtual
> source storage object locations ("filenames") for distinct references to the same
> topic or map. That doesn't remove the requirement for source-to-deliverable
> address mapping, but it means that authors may influence the details of the
> result.
>
> The DITA 1.2 spec doesn't say anything about topic ID uniqueness because it
> doesn't need to. Topic IDs don't need to be unique, except as already required
> by XML rules.
>
> It can be a *convenience* to assign unique IDs to the topics under your control,
> but there is no way that any agency short of the divine can ensure global ID
> uniqueness unless we mandate the use of a specific UUID generator.
>
> By the same token, there's nothing wrong with making your topic IDs globally
> unique if you want to, it's just not necessary and could be a waste of effort. Or it
> could be a useful simplifying strategy. A typical use case might be to make topic
> IDs be object IDs of topics managed in a component content management
> system. That's fine as long as everyone is clear that these IDs can at best be
> unique within the scope of that one component content management system
> instance (even if you're using some sort of UUID generator there's always the
> chance, however remote, that somebody might randomly choose the same ID
> for one of their topics).
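
The source-to-deliverable address mapping described in this message can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `build_address_map`, the `t1`, `t2`, ... anchor scheme, and the file names are hypothetical, and a real processor would track element IDs and keys as well.

```python
from itertools import count


def build_address_map(source_topics, deliverable_name):
    """Assign deliverable-unique anchors and record where each authored topic landed.

    source_topics: iterable of (document_location, topic_id) pairs as authored.
    Returns a dict from the authored pair to its location in the deliverable.
    """
    counter = count(1)
    mapping = {}
    for source in source_topics:
        if source not in mapping:  # assumption: one anchor per authored topic
            mapping[source] = f"{deliverable_name}#t{next(counter)}"
    return mapping


# Two authored topics with the same topic ID in different documents.
topics = [("chapter1.dita", "topicid"), ("chapter2.dita", "topicid")]
address_map = build_address_map(topics, "guide.pdf")
print(address_map[("chapter2.dita", "topicid")])  # guide.pdf#t2
```

The processor, not the author, owns the uniqueness of the generated anchors, so no constraint flows back onto the authored topic IDs.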
>
> Cheers,
>
> Eliot
>
> On 6/15/13 8:08 PM, "Jim Tivy" <jimt@bluestream.com> wrote:
>
> > Hi Folks
> >
> > I have found numerous discussions saying that topic id is not required to
> > be unique within a publication or collection of topics; none of these
> > discussions is in the current 1.2 specification (that I could find
> > anyhow), although omission means no requirement.
> > One such reference was:
> > http://tech.groups.yahoo.com/group/dita-users/message/14260
> > Of course topic id does have to be unique within an XML document;
> > that is not what I am talking about here. Rather, I am addressing
> > intra-publication uniqueness or even global uniqueness.
> > Some PDF processors, such as the PDF5 processor for Antenna House,
> > however, require that topic ids do have to be unique within a publication.
> > At first it seems like this requirement is overstepping what OASIS has
> > recommended (or not recommended through omission). However, one
> > reason for this unique id requirement of PDF5 is to support the cross
> > publication linking use case.
> > It just so happens that we dealt with this use case recently in
> > approving proposal 13041 (Facility for key-based, cross-deliverable
> > referencing (Kimber)).
> > It seems if we do not recommend or say anything about unique topic
> > ids, then we leave processors to "twist in the wind" or make extra
> > requirements like PDF5 did. On the other hand, if we require unique
> > topic ids, we might be pre-supposing certain implementations which in
> > fact are not necessary.
> > It seems, however, if we are to add proposals such as 13041, then we
> > might want to talk about how cross publication linking might happen;
> > this proposal 13041 opens the door to some new possibilities.
> >
> > For example, if our references were key rooted, we could use key export
> > tables and the processors could do something like the following:
> >
> > I use a PDF example here but it may have bearing on other cross
> > publication links such as cross chunked HTML.
> > In PDF, for example, if processors are allowed to assign their own unique
> > ids to topics for the purposes of merge (like the PDF2 merge), then linking
> > from PDFB to PDFA would require PDFA to export its external links to PDFB,
> > because the ids of the topics in the PDF are not known at author time.
> >
> > PDFA (export as XML)
> >
> > keyname   newMergetopicId   Original fragmentId
> > MyKey1    a223345           be3333333
> >
> > Then PDFB consumes this and has a reference to MyKey1/be3333333
> >
> > Then when a processor builds PDFB and when it references PDFA with
> > MyKey1/be3333333 it would resolve to PDFA.a223345/be3333333
> >
> > In this case, a223345 could be entirely generated by the PDF processor
> > when PDFA is built; however, be3333333 would remain stable but not
> > unique as a fragment Id.
> >
> > My question here is, should we say something in the spec or when we
> > document proposal 13041 regarding this? Should we have text that says
> > "we DO NOT recommend processors rely on unique topic ids within a
> > publication" or "we DO recommend same"?
> >
> > cheers
> > Jim
>
> --
> Eliot Kimber
> Senior Solutions Architect, RSI Content Solutions "Bringing Strategy, Content,
> and Technology Together"
> Main: 512.554.9368
> www.rsicms.com
> www.rsuitecms.com
> Book: DITA For Practitioners, from XML Press,
> http://xmlpress.net/publications/dita/practitioners-1/
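
The export-table resolution in Jim's PDFA/PDFB example can be sketched in Python. The values come from the example in the message; the dictionary layout and the `resolve` function are assumptions made for illustration, not part of proposal 13041 or of any PDF processor.

```python
# Export table published when PDFA is built: key name -> generated merge topic ID.
# The values are from the example in the message; the structure is an assumption.
PDFA_EXPORT = {"MyKey1": "a223345"}


def resolve(reference: str, deliverable: str, export_table: dict) -> str:
    """Resolve 'key/fragmentId' to 'deliverable.mergeTopicId/fragmentId'."""
    key, _, fragment_id = reference.partition("/")
    merge_topic_id = export_table[key]  # raises KeyError if the key was not exported
    return f"{deliverable}.{merge_topic_id}/{fragment_id}"


resolved = resolve("MyKey1/be3333333", "PDFA", PDFA_EXPORT)
print(resolved)  # PDFA.a223345/be3333333
```

The generated ID `a223345` can change every time PDFA is rebuilt; only the key and the stable fragment ID appear in PDFB's authored reference.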