
Subject: Fw: Defining topics

From: Michael Priestley <mpriestl@ca.ibm.com>
To: dita@lists.oasis-open.org
Date: Thu, 24 Nov 2005 18:51:12 -0500

I sent this to the dita-users list, but it's relevant to the ongoing discussion of best practices for when to use topics vs. sections.

Michael Priestley
IBM DITA Architect
SWG Classification Schema PDT Lead
mpriestl@ca.ibm.com
----- Forwarded by Michael Priestley/Toronto/IBM on 11/24/2005 06:50 PM -----

Michael Priestley/Toronto/IBM

11/21/2005 01:10 PM

To: dita-users@yahoogroups.com
cc:
Subject: Defining topics

I'm going to take a step back and look at the process I use to define topics for an information set, whether in the context of migrating into DITA or authoring from scratch.

The process has four components:

1. Create a hierarchical outline based on the headings or hierarchy of the intended deliverable.
2. Starting from the bottom (i.e., the leaf nodes of the hierarchy), decide where the topic boundaries are.
3. Also starting from the bottom, decide where the document boundaries are.
4. For output, decide where the output chunking boundaries are (new functionality based on the chunk attribute).

1. THE OUTLINE
First, the outline. This example uses a reference-oriented outline; I've documented task-oriented outline examples elsewhere (e.g., http://xml.coverpages.org/dita.html#priestleyWinUA2005):

Foobar guide
  Overview
    Foobar basics
    Simple foobar example
  Foobar type A
    Description
    Interface
    Commands
      a.create
        syntax
        parameters
        example
      a.edit
        syntax
        parameters
        example
    Command summary
    Usage examples
      In a web
      On a boat
  Foobar type B
    (same structure)

2. IDENTIFYING TOPICS
Second, starting from the bottom, identify topics:

Foobar guide (map title)
  Overview (topic)
    Foobar basics (topic)
    Simple foobar example (topic)
  Foobar type A (topic)
    Description (section)
    Interface (section)
    Commands (topic)
      a.create (topic)
        syntax (section)
        parameters (section)
        example (section)
      a.edit (topic)
        syntax (section)
        parameters (section)
        example (section)
    Command summary (topic)
    Usage examples (topic)
      In a web (topic)
      On a boat (topic)

At the leaf node level, I make the decision as to whether something is a section or a topic based on the attributes we've already talked about: does it have a unique title, does it make sense on its own, etc. But the size of the content is also a factor, and "makes sense on its own" is not an absolute - even if something is relatively context-dependent right now, it may evolve over time to be less so.

At the parent level, everything that contains topics is a topic, right up to the top - regardless of uniqueness of title or size of content. In a web environment, each of these containers becomes a linking hub, and will probably require a more meaningful title for the sake of search.

I also make decisions based on surroundings: for example, "Command summary" is a repeating title, but because I'm putting it at the same level as other topics, I make it a topic in its own right. In the case of "Description" and "Interface", I would probably leave them as sections if they are small, or graduate them to topics if they are larger.
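
As a rough illustration only (it is not part of any DITA tooling), the labelling rule described above can be sketched in Python. The `Node` class and its `standalone` flag are hypothetical names: the flag stands in for the author's leaf-level judgment (unique title, makes sense on its own, size). Treating every node that has children as a topic is one reading of "everything that contains topics is a topic", combined with the fact that DITA sections cannot contain other sections. The placement of "Command summary" beside "Commands" is likewise an assumption about the outline's nesting.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    standalone: bool = False  # hypothetical: author's judgment for a leaf node
    children: list = field(default_factory=list)


def classify(node, depth=0):
    """Return (depth, title, kind) rows for node and all of its descendants."""
    # A container is always a topic; a leaf is a topic only if judged standalone.
    kind = "topic" if (node.children or node.standalone) else "section"
    rows = [(depth, node.title, kind)]
    for child in node.children:
        rows.extend(classify(child, depth + 1))
    return rows


# One branch of the sample outline (a.edit omitted for brevity).
type_a = Node("Foobar type A", children=[
    Node("Description"),
    Node("Interface"),
    Node("Commands", children=[
        Node("a.create", children=[
            Node("syntax"), Node("parameters"), Node("example"),
        ]),
    ]),
    Node("Command summary", standalone=True),
    Node("Usage examples", children=[
        Node("In a web", standalone=True),
        Node("On a boat", standalone=True),
    ]),
])

rows = classify(type_a)
for depth, title, kind in rows:
    print("  " * depth + f"{title} ({kind})")
```

Flipping `standalone` on "Description" or "Interface" is the code equivalent of graduating those sections to topics once they grow.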

3. IDENTIFYING DOCUMENTS

Now that I've identified topics, I can identify document boundaries. There may be branches of topics that I want to author and deliver together, and this is where I make that decision - drawing document boundaries around topics that I want to work with as a group.

Foobar guide (map)
  Overview (multi-topic document)
    Foobar basics
    Simple foobar example
  Foobar type A
    Description
    Interface
    Commands (multi-topic document)
      a.create
        syntax
        parameters
        example
      a.edit
        syntax
        parameters
        example
    Command summary
    Usage examples (multi-topic document)
      In a web
      On a boat

The overview is grouped together to make it easier to read or print for the user, and I author it in the same way. Commands are grouped together to make it easier for readers to scan the available commands for the current foobar type. Usage examples are grouped together if they're not too large.

4. IDENTIFYING OUTPUT CHUNKING

This depends on new functionality with the chunk attribute in DITA maps, or else on transform overrides for particular media or deliverables. For example, for a help system read alongside an application, I'd probably want to chunk the output down to the topic level; for delivery on the Web or with a standalone browser, I could afford larger chunks. And for PDF output, I might actually want to provide chunks at the chapter level (e.g., one for the overview, and one for each type of foobar). The decision on chunking for output will depend on the user, the media, and the subject. So a topic that is standalone in one deliverable might be pulled together with others in another deliverable.

In this case, let's assume a help browser. I'm using "chunk as-is" to identify a chunk boundary that is the same as the document boundary, and "new chunk" to identify a new boundary introduced by the map that overrides the existing document boundary:

Foobar guide (map)
  Overview (chunk as-is)
    Foobar basics
    Simple foobar example
  Foobar type A (chunk as-is)
    Description
    Interface
    Commands (chunk as-is)
      a.create (new chunk)
        syntax
        parameters
        example
      a.edit (new chunk)
        syntax
        parameters
        example
    Command summary (chunk as-is)
    Usage examples (chunk as-is)
      In a web (new chunk)
      On a boat (new chunk)

So in this example I'm leaving the overview as a single document on output, since it will still likely be printed or read together; but I'm splitting out each of the commands into its own document on output, so that users with a small display don't have to scroll. And I'm splitting the usage examples out into separate documents, since it's possible that users will only want to read one and won't care about the other.
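
To make the override concrete, here is a small Python sketch of how the output chunks for the help-browser case could be derived. It models only the decision, not the chunk attribute itself; `DOCUMENT_OF`, `NEW_CHUNK`, and `output_chunks` are hypothetical names, and the topic-to-document table is an assumed reading of the document boundaries from step 3.

```python
# Topic -> authoring document, assumed from the document boundaries in step 3.
DOCUMENT_OF = {
    "Overview": "Overview",
    "Foobar basics": "Overview",
    "Simple foobar example": "Overview",
    "Foobar type A": "Foobar type A",
    "Commands": "Commands",
    "a.create": "Commands",
    "a.edit": "Commands",
    "Command summary": "Command summary",
    "Usage examples": "Usage examples",
    "In a web": "Usage examples",
    "On a boat": "Usage examples",
}

# Topics that the map for this deliverable splits out ("new chunk").
NEW_CHUNK = {"a.create", "a.edit", "In a web", "On a boat"}


def output_chunks(document_of, new_chunk):
    """Group topics into output chunks; map overrides win over documents."""
    chunks = {}
    for topic, document in document_of.items():
        # "new chunk": the topic becomes its own output file.
        # "chunk as-is": the topic stays with its authoring document.
        key = topic if topic in new_chunk else document
        chunks.setdefault(key, []).append(topic)
    return chunks


chunks = output_chunks(DOCUMENT_OF, NEW_CHUNK)
for name, topics in chunks.items():
    print(f"{name}: {', '.join(topics)}")
```

A different deliverable would reuse the same `DOCUMENT_OF` table with a different `NEW_CHUNK` set, which mirrors the point that different maps can chunk the same documents differently.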

SUMMARY

- The decision about what to chunk as a topic starts from the leaf nodes in the outline, and is not based on document chunking.
- The decision about what to chunk for documents depends on authoring requirements and default output chunking.
- The decision about what to chunk for output can be overridden in the map, so different maps can chunk documents differently, simply by changing the document boundaries around a topic.

The definition of a topic is not determined solely by authoring needs (that's done by combining topics into a single document where necessary) or output needs (that can be addressed with transform behaviors and the chunk attribute).

Instead, the definition of the topic is based on a combination of concerns:
- its content (is it large enough to be a standalone topic? is it usable on its own, or capable of evolving into being usable on its own?)
- its structure (does it have a title, so we can recognize it, retrieve it, and reuse it at the map level?)
- its relationships (is it at the same level as other topics? does it have links to other topics? does it contain topics itself?)

I'm hoping this helps to clarify my assumptions. I'd also welcome comments from other information architects as to whether this process seems familiar or reasonable.

Michael Priestley
IBM DITA Architect
SWG Classification Schema PDT Lead
mpriestl@ca.ibm.com