Subject: Re: [dita] chunking description for spec
From: "W. Eliot Kimber" <ekimber@innodata-isogen.com>
To: <dita@lists.oasis-open.org>
Date: Tue, 13 Feb 2007 06:15:37 -0600
A general comment, reflected in my detailed comments below: the DITA 
spec should never use "file" when "XML document" is meant (and since 
files are not part of any standard DITA is defined in terms of, "file" 
should only be used in the context of non-normative examples). Likewise, 
"file" should never be used where "storage object" is meant.

> *Reuse of a nested topic* – A content provider creates a set of topics 
> as a single module. A reuser wants to incorporate only one of the nested 
> topics from the module. 

Change "module" to "document". In this context the term "module" has no 
DITA-defined meaning and could be confused with "module" as it is used 
in the context of specialization. (A search on "module" in the latest 
architecture spec shows that "module" is only used in the context of 
specialization modules.)

> The reuser can reference the nested topic from a
> DITA map, using the chunk attribute to specify that the topic should be 
> produced in its own file.

I'm not sure why "chunk" is needed in this case--if I'm re-using a 
topic, regardless of where it occurs within its containing document, it 
will be processed without regard to its *parents*.

That is, while there may not be a default *value* for chunk, there 
absolutely is a default chunking behavior and it is definitely not 
"select-document" (I'm not even sure select-document is a good idea but 
that's a discussion for another day).

That is, in the case where the parent topic of the directly-used topic 
is not otherwise included in the map (or is not included via a parent 
topicref of the topicref using the directly-used nested topic) then the 
parent topic should not be involved at all, unless the chunk value is 
*explicitly set* to "select-document".
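
To make the default argued for above concrete, here is a minimal Python 
sketch. It is illustrative only and does not describe any existing DITA 
processor: the function name select_for_topicref, the sample document, 
and the ids P1 and N1 are all hypothetical, and it matches only 
unspecialized <topic> elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document: parent topic P1 containing nested topic N1.
SOURCE = """<topic id="P1"><title>Parent</title>
  <topic id="N1"><title>Nested</title></topic>
</topic>"""


def select_for_topicref(doc_xml, topic_id, chunk=""):
    """Return the element that a topicref to `topic_id` brings into processing.

    Assumed behavior (the proposal in this message, not spec text):
    the parent topic is involved only when the chunk value explicitly
    contains the "select-document" token.
    """
    root = ET.fromstring(doc_xml)
    if "select-document" in chunk.split():
        return root  # whole containing document, on explicit request only
    # Default: the referenced topic and its own descendants, never its parents.
    for element in root.iter("topic"):
        if element.get("id") == topic_id:
            return element
    raise KeyError(topic_id)


nested_only = select_for_topicref(SOURCE, "N1")
whole_document = select_for_topicref(SOURCE, "N1", chunk="select-document")
```

With no chunk value the sketch selects only N1; P1 is selected only when 
"select-document" is given explicitly.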

> *Use of the chunk attribute*
> 
> When a set of topics is transformed for output using a map, the map 
> author may use the chunk attribute to override the implementation 
> specific default behavior. 

I think a pointer to the implementation-specific stuff at the end of 
this discussion would be useful here--until I got there it was not clear 
what "implementation specific" meant.

> The chunk attribute allows the map author to
> request that multi-topic files be broken into smaller files, and that

change to: "request that multi-topic XML documents be broken into 
multiple XML documents"

> multiple individual topics be combined into a larger files.  

change to: "combined into larger XML documents."

> *by–topic – *When the chunk attribute value includes the “by–topic” 
> token, a chunking policy is established for the current topicref element 
> where a separate output document is produced for each source topic in

Here I think "document" should be "chunk" given that the details of the 
outputs cannot be known in this statement and therefore may or may not 
be documents in the XML sense (or any other sense).

> the referenced document. 

change: "topic in the referenced document" to "child topic of the 
referenced topic". There is no sense in which topic-containing documents 
can be referenced--only topics can be referenced using topicref 
format="topic".

> The policy only applies for a chunk action of
> the current element (for example, to-content), except when it is set on 
> the map or map specialization element, when the “by-topic” policy is 

"or map specialization element" is unneeded--it is sufficient to say 
"the map" as that implicitly includes any specializations of map.

> *by–document – *When the chunk attribute value includes the 
> “by–document” token, a chunking policy is established for the current 
> topicref element where a single output document is produced for the 

"output chunk"

> Some tokens or combinations of tokens may not be appropriate for all 
> output types. When unsupported or conflicting tokens are encountered 
> during output processing, warning and error messages may be produced. 
>  Recovery from such conflicts or other errors is implementation dependent.

Change "may" to "should".

This next paragraph is way too implementation dependent (it assumes a 
file-based storage system for example). I would make it a more qualified 
note:

add: "NOTE: When creating new topics via chunk processing, the storage 
object name or identifier (if relevant) is taken from the copy-to 
attribute if set; otherwise the root name is taken from the id attribute 
if the by-topic policy is in effect and from the name of the referenced 
document if the by-document policy is in effect."
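
The precedence in that note can be sketched in a few lines of Python. 
This is only an illustration under stated assumptions: the function 
chunk_name and its parameters are hypothetical, and deriving the "root 
name" of a document by dropping its extension is my assumption, not 
something the note specifies.

```python
import posixpath


def chunk_name(copy_to, topic_id, document_name, policy):
    """Pick the name for a storage object created by chunk processing.

    Precedence (as proposed in the note): copy-to if set, otherwise the
    topic id under by-topic, otherwise the document's root name under
    by-document.
    """
    if copy_to:
        return copy_to  # explicit name always wins
    if policy == "by-topic":
        return topic_id
    if policy == "by-document":
        # Assumption: the root name is the base name without its extension.
        return posixpath.splitext(posixpath.basename(document_name))[0]
    raise ValueError("unknown chunking policy: " + repr(policy))


by_topic = chunk_name(None, "P1", "nested1.dita", "by-topic")
by_document = chunk_name(None, "P1", "nested1.dita", "by-document")
explicit = chunk_name("custom.dita", "P1", "nested1.dita", "by-topic")
```

Here by_topic is "P1", by_document is "nested1", and explicit is 
"custom.dita", whichever policy is in effect.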


> *Examples*
> 
> Given several single topic files, parent1.dita, parent2.dita, …, 

c/files/documents/

> child1.dita, child2.dita, …, grandchild1.dita, grandchild2.dita 
> containing topics with ids P1, P2, …, C1, C2, …, GC1, GC2, …., several 
> nested topic files, nested1.dita, nested2.dita, …, each containing two 

c/files/documents/


> Produces a single output file, P1.xxxx, containing topic P1 and topics 

c/file/chunk/ and throughout


> 
> For use in the %topicref-atts; and %topicref-atts-no-toc; descriptions 
> in Chapter 23 of the DITA 1.1 Language Reference Specification:
>  
> *Name* 	*Description* 	*Data Type* 	*Default Value* 	*Required?*
> chunk 	When a set of topics is transformed using a map, the chunk 
> attribute allows multi-topic files to be broken into smaller files, and 

c/files/documents/

> multiple individual topics to be combined into  larger combined files.

c/files/documents/

Cheers,

Eliot
-- 
W. Eliot Kimber
Professional Services
Innodata Isogen
8500 N. Mopac, Suite 402
Austin, TX 78759
(214) 954-5198

ekimber@innodata-isogen.com
www.innodata-isogen.com
References:
- chunking description for spec
  - From: Michael Priestley <mpriestl@ca.ibm.com>