Re: [dita] DITA 2.0 thought: chunking

I think this is great, with some minor, vague reservations.

I worry about leaving it a CDATA attribute and allowing for proprietary tokens. Any proprietary tokens will break content interoperability at the DITA pre-processing level, and that makes me queasy. As a rule, the spec is very hands-off when it comes to how to render DITA content, but it’s as specific as possible in how to process content before display (and will be more so in 2.0 when we describe a processing order). This helps ensure that the same content will work consistently (or at least consistently enough) across compliant tools. Any custom @chunk token ruins the ability to move content between tools unless there is also a mechanism for describing its expected behavior in a standard way.

Also, part of me worries just a bit about losing the ability to select a topic without its children. I don’t have a compelling use case, but it still strikes me as a reasonable capability. That said, I wouldn’t want that ability encapsulated in the @chunk attribute. It should be its own attribute, something like select="topic-structure|topic-only".
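
To make the idea concrete, here is a rough sketch of how a separate selection attribute might look in a map. The @select attribute and its two values are only the suggestion made above and are not part of any DITA specification; the file name and topic ID are hypothetical.

```xml
<map>
  <title>Selection sketch (hypothetical file name and topic ID)</title>
  <!-- Hypothetical @select: pull in only the referenced topic, without its nested children -->
  <topicref href="concepts.dita#overview" select="topic-only"/>
  <!-- Hypothetical @select: pull in the referenced topic together with its nested children -->
  <topicref href="concepts.dita#overview" select="topic-structure"/>
</map>
```

Keeping selection in its own attribute would leave @chunk free to describe only how documents are split or merged.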

Chris

From: <dita@lists.oasis-open.org> on behalf of Joe Pairman <joe.pairman@mekon.com>
Date: Wednesday, June 22, 2016 at 6:46 AM
To: Robert D Anderson <robander@us.ibm.com>, OASIS DITA TC List <dita@lists.oasis-open.org>
Subject: Re: [dita] DITA 2.0 thought: chunking

Robert,

Your suggestions make sense to me. I have used @chunk quite a lot, but only ever “to-content”, both on topicrefs and on the root element of a map or ditamap.

I understand that there could be a case for bursting out topics too, including nested topics (i.e. the current “by-topic”), but beyond that I can’t think of much else.

Also, “merge” and “split” make sense as new, comprehensible values, and would avoid the blank looks I get when I tell people about “to-content”.

Cheers,

Joe

From: <dita@lists.oasis-open.org> on behalf of Robert D Anderson <robander@us.ibm.com>
Date: Tuesday, June 21, 2016 at 21:24
To: OASIS DITA TC List <dita@lists.oasis-open.org>
Subject: [dita] DITA 2.0 thought: chunking

In nearly every discussion about breaking compatibility in DITA 2.0, I've said that I was pretty sure chunking would change. Of course, I don't know how, and until recently hadn't tried to think too much about it.

So here are my thoughts: I think it should change, but not by as much as I thought initially.

DITA 1.0 defined the attribute, but did not specify any values. The definition was very closely tied to DITA-OT. DITA 1.1 formally defined values for @chunk (by-topic, by-document, select-topic, select-document, select-branch, to-content, to-navigation). However, the values were mostly defined by example - they were very hard for authors and implementers to understand or use.

Those definitions were clarified in DITA 1.2, and grouped by purpose. We recognized that the values defined in DITA 1.1 filled several roles:
* Ability to select topics -- "I want the whole document", "I want this topic and its children, ignore peers", "I want only this specific topic from the document, no children"
* Ability to split or combine documents
* Ability to control rendering of content or navigation
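
As an illustration of how the DITA 1.1/1.2 tokens line up with those three roles, here is a minimal map sketch. The file names and topic IDs are hypothetical, and the comments are a loose paraphrase of the spec definitions rather than normative text.

```xml
<map>
  <title>DITA 1.2 chunk tokens by role (hypothetical file names)</title>

  <!-- Selection: "I want the whole document" -->
  <topicref href="guide.dita" chunk="select-document"/>
  <!-- Selection: "I want this topic and its children, ignore peers" -->
  <topicref href="guide.dita#setup" chunk="select-branch"/>
  <!-- Selection: "I want only this specific topic, no children" -->
  <topicref href="guide.dita#setup" chunk="select-topic"/>

  <!-- Split: one output document per topic in the referenced document -->
  <topicref href="reference.dita" chunk="by-topic"/>
  <!-- Combine: one output document for the referenced document -->
  <topicref href="reference.dita" chunk="by-document"/>

  <!-- Rendering: emit the selection as a new chunk of content -->
  <topicref href="faq.dita" chunk="to-content"/>
  <!-- Rendering: emit the selection as a new chunk of navigation -->
  <topicref href="faq.dita" chunk="to-navigation"/>
</map>
```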

DITA 1.3 deprecated "to-navigation" but otherwise changed little.

Among my many issues with chunking:
- The token names do not make any sense. I can't imagine anybody has ever used them without looking up a definition.
- The tokens tie several functions together
- Instead of defining the attribute based on what people need, it was defined based on what might be theoretically useful / possible. As a result, I consider many tokens to be of some theoretical use but little practical use.

My suggestion for DITA 2.0:
- We should throw out the existing token values. The names don't make sense.
- We should only define new values based on the core purpose of the attribute. In my experience, this means a value to split documents, and a value to merge a map branch into a single document. I'd suggest "split" and "merge" as the values. Those would correspond pretty closely to the current "by-topic" and "to-content".
- The attribute should remain CDATA. If an implementation wants to provide additional function, it can (as it can today). This also allows for easy migration; old values remain valid in the document, and can be updated as needed over time. Applications could support old values for a while or in a 1.x compatibility mode.
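
A minimal sketch of what the suggestion above could look like in a map, pairing each current token with its proposed replacement. The values "merge" and "split" are only the names proposed in this message, not defined spec values, and all file names are hypothetical.

```xml
<map>
  <title>Chunking sketch (hypothetical file names)</title>

  <!-- DITA 1.x: merge this branch into a single output document -->
  <topicref href="install.dita" chunk="to-content">
    <topicref href="install-linux.dita"/>
    <topicref href="install-windows.dita"/>
  </topicref>

  <!-- Proposed for 2.0: roughly the same intent, using the suggested value "merge" -->
  <topicref href="configure.dita" chunk="merge">
    <topicref href="configure-network.dita"/>
    <topicref href="configure-users.dita"/>
  </topicref>

  <!-- DITA 1.x: split a multi-topic document into one output document per topic -->
  <topicref href="reference.dita" chunk="by-topic"/>

  <!-- Proposed for 2.0: roughly the same intent, using the suggested value "split" -->
  <topicref href="glossary.dita" chunk="split"/>
</map>
```

Because the attribute would stay CDATA, a map like this mixing old and new values would remain valid, and how the old tokens are handled would be left to the processor.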

Thanks for making it this far - I'm curious what others think about this. I also realize that if the TC likes this direction, I will probably (though reluctantly) be the owner of an eventual DITA 2.0 proposal for it.

For reference, chunking through the ages:
DITA 1.0: http://docs.oasis-open.org/dita/v1.0/langspec/topicref-atts.html
DITA 1.1: http://docs.oasis-open.org/dita/v1.1/CS01/archspec/chunking.html
DITA 1.2: http://docs.oasis-open.org/dita/v1.2/os/spec/archSpec/chunking.html
DITA 1.3: http://docs.oasis-open.org/dita/dita/v1.3/os/part1-base/archSpec/base/chunking.html

Regards,
Robert D. Anderson
DITA-OT lead and Co-editor DITA 1.3 specification
Digital Services Group

E-mail: robander@us.ibm.com
11501 Burnet Rd, Austin, TX 78758-3400, USA
