dita message

Subject: RE: [dita] example of procedure topic

From: "Esrig, Bruce (Bruce)" <esrig@lucent.com>
To: "'France Baril'" <France.Baril@ixiasoft.com>
Date: Mon, 14 Nov 2005 06:17:04 -0500

Hi France,

This is to support the point of view that topics are standalone units with rich internal structure, requiring at least subsections or the equivalent.

There are several paradigm-setting assumptions at stake here.

For the sake of concreteness, I'll offer a new example. (For those who prefer a simple example with comprehensible and formatted output, please refer to Paul Prescod's example of the RESOLUTION section at the URL http://support.microsoft.com/kb/906294).

The following construct is from an Information Mapping(R)-based architecture that Lucent uses internally. (Note that Information Mapping uses "map" to mean "topic". This sense of "map" is what I have been trying to convey with the term "rich topic" in recent messages.)

Our procedure topic has a slight superset of the following form (omitting closing elements):

<LU-LiPP:procedure-map>
  <LU-LiPP:access-info>
  <LU-LiPP:title>
  <LU-LiPP:when-to-use>
  <LU-LiPP:related-information>
  <LU-LiPP:before-you-begin-blk>
  <LU-LiPP:procedure-blk>
  <LU-LiPP:block>
</LU-LiPP:procedure-map>

Our specification for <LU-LiPP:before-you-begin-blk> contains a collection of LU-LiPP subblocks.

Our style guide gives examples of standard LU-LiPP subblocks such as:

required skills, time to perform, frequency, ..., system configuration, and maintenance state.

Each LU-LiPP subblock contains a broad range of content preceded by a specific title.

The content of the title is up to the author, but the style guide provides standard titles for consistency.

So for example:

<LU-LiPP:procedure-map>
  <LU-LiPP:title>Configure a new connection</LU-LiPP:title>
  <LU-LiPP:when-to-use>Use this procedure to establish a new connection between the A system and the B system</LU-LiPP:when-to-use>
  <LU-LiPP:before-you-begin-blk>
    <LU-LiPP:subblock>
      <LU-LiPP:title>System configuration</LU-LiPP:title>
      <LU-LiPP:p>Check that both systems are up and that any processes that statically rely on the state of the A-B connection are stopped.</LU-LiPP:p>
    </LU-LiPP:subblock>
  </LU-LiPP:before-you-begin-blk>
  <LU-LiPP:procedure-blk> ... </LU-LiPP:procedure-blk>
</LU-LiPP:procedure-map>

In response to your positions and questions ...

1. What to do below the map level ... small chunks facilitate reuse

For reuse, what we would do is chunk within the topic (LU-LiPP:procedure-map).

Not all chunks that we would create are standalone as topics are.

For example, system configuration contains information that may be identical for multiple procedures, but when viewed alone, is useless. It is necessary to know when to put the system in a particular configuration, and this information is communicated by embedding the system configuration information in the context of a topic.

2. Topic titles and heading levels

It is necessary to detangle these and then to use both notions.

In fact, it is necessary to use them in combination with
- "stand-alone"
- "chunk"
- "separately-delivered"
and make careful distinctions among these.

Once you get your information divided up into small chunks with titles, you need to provide assemblies of those chunks that have distinctive structures.

If the topic is to be understandable when presented alone, it must do some of this assembly.

On the other hand, for some contexts it may be valuable to be able to chunk below the topic level and even to deliver chunks that are below the topic level.

When this is done, the system that delivers the chunks is responsible for maintaining the connection between the chunk and the purpose of presenting the chunk.

Even a separately-delivered chunk is not necessarily sufficiently standalone to be called a topic.

3. Semantic tagging

There is a downside to semantic tagging, which is the proliferation of defined tags.

This leads to a tools idea: wouldn't it be neat if the list of available tags could be filtered by module?

As a matter of language design, we still need to provide an underlying flexibility before specializing.
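
The tools idea above (filtering the list of available tags by module) can be sketched roughly as follows. This is only an illustration: the module names and the assignment of tags to modules are hypothetical, chosen from element names that appear elsewhere in this thread, and do not describe any actual DITA or LU-LiPP module inventory or editor API.

```python
# Hypothetical catalog mapping module names to the tags each one defines.
# The grouping is illustrative only, not a real DITA or LU-LiPP inventory.
TAGS_BY_MODULE = {
    "base": ["title", "p", "block", "subblock"],
    "procedure": ["when-to-use", "before-you-begin-blk", "procedure-blk"],
    "inline": ["abbrev", "command-syntax"],
}


def available_tags(active_modules):
    """Return the sorted tags offered when only the given modules are active."""
    tags = set()
    for module in active_modules:
        # Unknown module names contribute no tags rather than raising.
        tags.update(TAGS_BY_MODULE.get(module, []))
    return sorted(tags)


if __name__ == "__main__":
    # An author working only on inline markup sees a much shorter tag list.
    print(available_tags(["base", "inline"]))
```

An editing tool could call something like available_tags with the modules relevant to the author's current task, so that the full set of specialized tags never has to be shown at once.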

4. Unarchitected content

This is the most lamentable part of my situation.

Not all content is clean. Not all content will ever be clean. Not all content that migrates into DITA is clean.

DITA offers a topic-oriented philosophy and support for topic-oriented information representation.

Within Lucent, we have had to acknowledge that we cannot enforce good structuring practices using schema/DTD technology. Here's a compensating thought: when you offer even a few hints of how to use markup to indicate good structuring practices, receptive people tend to take advantage of those hints and create extremely well-structured information. Here's an enforcement thought: we could probably write a script that measures the complexity of a topic, and allows us to home in on and retrain groups that are blatantly violating the principle of topic orientation.
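
A minimal sketch of the kind of complexity-measuring script mentioned above, assuming that "complexity" is approximated by two simple measures: the number of elements in a topic and its maximum nesting depth. Both the measures and the thresholds are assumptions made for illustration; the sample topic is invented and does not come from any real document set.

```python
import xml.etree.ElementTree as ET


def topic_complexity(xml_text):
    """Return (element_count, max_depth) for the XML of a single topic."""
    root = ET.fromstring(xml_text)

    def walk(elem, depth):
        # Count this element, then fold in the counts and depths of children.
        count, deepest = 1, depth
        for child in elem:
            child_count, child_depth = walk(child, depth + 1)
            count += child_count
            deepest = max(deepest, child_depth)
        return count, deepest

    return walk(root, 1)


def is_suspect(xml_text, max_elements=200, max_depth=8):
    """Flag a topic whose size or nesting exceeds the (assumed) thresholds."""
    count, depth = topic_complexity(xml_text)
    return count > max_elements or depth > max_depth


# Invented sample topic, used only to exercise the functions.
SAMPLE = """<topic><title>Configure a new connection</title>
<body><section><title>System configuration</title>
<p>Check that both systems are up.</p></section></body></topic>"""

if __name__ == "__main__":
    print(topic_complexity(SAMPLE))
    print(is_suspect(SAMPLE))
```

Run over a whole repository, a report of flagged topics per authoring group would indicate where retraining effort might be directed, although the thresholds would need tuning against topics that are known to be well structured.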

Unfortunately, even that is not so easy. What we want is for people to see the advantages of topic orientation. What really happens is that some see it and some don't. I think it's a cultural clash that cannot be resolved through the use of technology. And if DITA is to be a widely-adopted markup language, it has to accommodate those who wish to violate its core philosophy, even as it tries to persuade them to adopt its core philosophy. In many cases, people may simply not adopt the language unless they can bring their misconceptions with them; yet, if those people are allowed to adopt the language, they can in many cases eventually be persuaded to use it as intended.

5. (taking this one step further) Musings towards a more robust methodology.

Please pardon me if I'm a generation behind on the philosophy of programming languages, but some of this used to be true.

We are all in the shadow of the great "goto" debate.

Quoting from http://java.about.com/od/beginningjava/l/aa_control_1.htm

Control flow statements are the tools programmers use to make decisions about which statements to execute and to otherwise change the flow of execution in a program. The four categories of control flow statements available in Java are selection, loop, exception, and branch.

Control Flow Statements
Category    Keyword
Selection   if else switch case
Loop        while do for
Exception   throw try catch finally
Branch      return break continue label:

Branch
Branch statements explicitly redirect the flow of program execution. The most infamous branch statement in the history of computing is goto. In Java, goto is an unimplemented keyword, so it cannot be used in Java programs. In other languages, goto is used to redirect program execution to another line of code in the program by specifying a line number or other type of label. In Java, break, continue, and label: fill the role of goto, but without many of goto's negatives.

This suite of structured primitives was dedicated to eliminating unexplained transfers of control in software.

We face a similar challenge here: what are the generic primitives that provide enough structure? Here, what we want to maintain are the relationships among different groupings of information. As in programming languages, we observe containment, sequencing, modularity, abstraction (such as topic), and reuse of components.

We have a difficulty because chunking doesn't always occur at the level of our main abstraction. We want to reuse below the topic level. In software, this was handled by lowering the granularity of the fundamental unit. Functions became comparatively small, and cognitively intense. Context is required to understand how an object-oriented method is being applied.

What we are facing is a little bit different from what the programming languages faced, because we expect our results, eventually, to fit the pattern of thinking of the end user. We are not writing for authors; we are writing for readers with specific information needs. This means that we cannot simply collect a system of relevant methods into a file and call it a module. We have to structure the information so that it can be manipulated by an automatic system and presented in a comprehensible manner to our audiences. I have been assuming that the usability of our results has to be achieved through direct support in a markup language, as opposed to programming languages, in which the language and the execution model are separate.

There are really two views of topic: the independent standalone unit versus the reusable chunk. The two are not the same. The position that I am taking is that a DITA topic is an independent standalone unit with interesting internal structure. We need to acknowledge the potential complexity of that internal structure.

If we choose the view that a topic is primarily a reusable chunk, then we need to create a "rich topic" paradigm at the map level to capture our notion of independent standalone unit.

I claim, though, that this last conception is not the correct view, since reusable chunks occur at every size and do not always have titles. Our intuition for topic is that it is an independent standalone unit. The rest of our approach must be founded on that assumption.

Best wishes,

Bruce

-----Original Message-----
From: France Baril [mailto:France.Baril@ixiasoft.com]
Sent: Wednesday, November 09, 2005 6:31 PM
To: Esrig, Bruce (Bruce)
Cc: dita@lists.oasis-open.org; Michael Priestley; David Brainard; Indi Liepa
Subject: RE: [dita] example of catalog of small topics

Hi Bruce,

I have some questions and a few comments based on my understanding of your email.

I'm not sure I understand what you want to add below the map level. You may have more than one level in a map. Moreover, it's much easier to use submaps, or a map per catalog, to reuse content in multiple deliverables than to create a huge topic and break it down. By what mechanism would you break it down?

It is important not to mix up topic title with heading level. The nesting in DITA is based on semantics, not presentation. You manage meaningful information chunks, and only then apply presentation and decide what expressions should be shown as headings.

In my experience, if a title is recurrent (same wording at a similar position), like "questions" in the previous example, it is a structural part of a topic, not a topic, and should therefore be handled as such. If you would otherwise type that title manually and later want the "questions" headings to become "exercises", semantic tagging will save you a lot of time.

One thing I didn't get: how do you put content into DITA without architecting it? You can't put semantic elements around content segments if they are not semantically accurate. Otherwise, you might as well use XHTML: XML that will give you heading levels and will not lead you into believing that the tag content is something that it is not.

I hope this sheds some light, unless I have missed the point you wanted to make.

France

France Baril

Documentation Architect/Architecte documentaire

IXIASOFT

tel.:         + 1 514 279-4942

fax:         + 1 514 279-3947

toll free:   + 1 877 279-IXIA

france.baril@ixiasoft.com

[   www.ixiasoft.com   ]

Let's Talk XML

From: Esrig, Bruce (Bruce) [mailto:esrig@LUCENT.COM]
Sent: November 8, 2005 10:37
To: France Baril
Cc: dita@lists.oasis-open.org; Michael Priestley; David Brainard; Indi Liepa
Subject: RE: [dita] example of catalog of small topics

France, your approach makes work for information architects! (That's a good thing provided that all information that is published is architected. We don't have that situation yet at our organization.) If you're willing to specialize <ol> to provide a new title, you can definitely reduce your need for nested structure.

Here's an excerpt from the Lucent in-house style guide, showing our use of subsections (what we call LiPP subblocks). (Please don't assume that this quotation accurately reflects the formatting of our style guide!)

This is a catalog within a single "rich topic" of various inline elements that we support. In the excerpt, <abbrev> and <command-syntax> are shown. Each inline element could be documented by a topic. The sections would contain the Attributes and Example.

Alternatively, the entire "rich topic" could be a topic with nested substructure. In that case, the default would be that the entire rich topic would be one chunk. But it would be possible for an author to break out the substructure as separate chunks if desired.

Michael would probably view the documentation on each inline element as one reusable topic. If that view is generally accepted, then the question (that this example raises) would be whether to provide something below the map level that permits aggregation of such topics into a small catalog.

Best wishes,

Bruce

============

Inline element descriptions

abbrev

Indicates an abbreviation....

Attributes

expand contains the abbreviation’s expansion text.

command-syntax

Contains an inline command. It will be formatted in an ASCII-like constant-width typeface. command-syntax is one of the elements that can be used either as a typical inline element or at the paragraph level, for example in an action element.

Attributes

The command-syntax element has a pgwide attribute that allows you to specify whether the ASCII text will fit to the text column or the page if command-syntax is being used on the paragraph level. For more information on the pgwide attribute, see “Using the pgwide attribute”.

Example

The ping command can be used to check TCP/IP connectivity.

-----Original Message-----
From: France Baril [mailto:France.Baril@ixiasoft.com]
Sent: Tuesday, November 08, 2005 9:26 AM
To: Michael Priestley; David Brainard; Indi Liepa
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] Two proposals for nested sections

I have not followed the whole thread; I just tried to find the source of the discussion before coming up with an answer. My conclusion is that I don't see why sections are needed to specialize the example that triggered this whole thread (see my proposed solution below - one of many).

I have met some cases where I was tempted to add extra section levels. After looking at these issues with a big question mark over my head for a while, I always found solutions that were, in the end, more satisfying. My own point of view for now is that the current model should stay as is, because it works semantically and it helps to reinforce some minimal usability rules.

Reusable Learning Object (topic/topic RLO/topic)
   Information (topic/p - or topic/section if many p per info)

   Information (topic/p - or topic/section if many p per info)
   Questions (topic/ul RLO/questions) --> XSL or CSS adds "Questions" as the title!!!
      Question title (topic/li RLO/question) followed by (topic/ph RLO/questiontitle)
         Para (topic/p)
         Para (topic/p)
         List (topic/ol RLO/ol) followed by li+
         Para (topic/p)
      Question title (topic/li RLO/question) followed by (topic/ph RLO/questiontitle)
         Para (topic/p)
         Para (topic/p)
         List (topic/ol) followed by li+
         Para (topic/p)

      Question title (topic/li RLO/question) followed by (topic/ph RLO/questiontitle)
         Para (topic/p)
         Para (topic/p)
         Table (topic/table RLO/questiontable) followed by table elements
         Para (topic/p)

From: Michael Priestley [mailto:mpriestl@ca.ibm.com]
Sent: November 7, 2005 22:56
To: France Baril; David Brainard; Indi Liepa
Subject: Fw: [dita] Two proposals for nested sections

I'm getting the sense that Paul and I are deadlocked, and I'd welcome some additional input on the thread, even if it's to tell me I'm crazy. If you haven't been following the thread, Paul wants to add at least one level of subsections to topic; my original suggestion was to rechunk his design (i.e., treat them as nested topics rather than nested sections); my fallback proposal was to create an entirely new base type, same level as topic, that allows nesting divisions (and can even embed topics if necessary).

Michael Priestley
IBM DITA Architect
SWG Classification Schema PDT Lead
mpriestl@ca.ibm.com
----- Forwarded by Michael Priestley/Toronto/IBM on 11/07/2005 10:45 PM -----

Michael Priestley/Toronto/IBM@IBMCA
11/07/2005 10:38 PM

To: "Paul Prescod" <paul.prescod@blastradius.com>
cc: dita@lists.oasis-open.org
Subject: RE: [dita] Two proposals for nested sections

"Paul Prescod" <paul.prescod@blastradius.com> wrote on 11/07/2005 07:14:02 PM:
> But it will make your life as spec editor much harder (as well as
> making the lives of readers harder). According to my understanding
> of the proposal, we would have to go through the entire DITA spec
> and everywhere it says "topic" (as in maps point to topics through
> "topicref" elements), we would have to say: "topic or thingee"
> (depending on what we call the thingees).
We went through a similar change earlier changing "topic type or map type" to "structural type".
> The topicref attribute
> would be a misnomer because it could point to topics or thingees.
It can already point to maps, PDFs, and websites, so it's arguably already a misnomer. If necessary we can introduce a domain specialization for <articleref>.
> We'll have to add "thingee" to the "type" attribute.
I'm suggesting it will be a base type, same as map and topic. And same as map and topic, when an element already exists in topic, we could just keep the topic class attribute.
> What module
> will the shared elements be in? That seems like a lot of painful
> reworking to me.
Since <article> would contain all the same elements as topic plus some additional ones, the shared elements would be in the topic module files (topic.mod etc.).
> It also implies that things with a certain
> organization are "topics" (even if they nest deeply!) whereas things
> with a slightly different organization (even if they have only one
> level of nesting) are not topics! They aren't articles. So I don't
> know what to call them.
This is the crux of our problem. I can propose all the compromises I want, but you and I seem to fundamentally disagree on the nature of a topic in DITA. So regardless of whether my proposal addresses your immediate issue, I suspect you will not be satisfied with anything short of changing the current definition of topic.
For me it comes down to how much control we put in the hands of the map author versus the content author. On the one hand, I want the content author to have considerable freedom in how they author content: one per file or multiple per file, nested or flat, etc. On the other hand I want the map author to have considerable freedom in how they reuse and integrate content: whether it's in one file or many, nested or flat etc. The topic is a handshake between the two formats: no matter how complex the content gets, it will be consumable in topic-sized chunks that have a maximum complexity determined by the limited nesting depth of topic.
The topic is also the unit of reuse in design and processing, allowing for shared design elements and processing modules at the topic level, across multiple complex document types and applications.
Changing the size of the basic unit of reuse - on the content, design, and processing dimensions alike - is not trivial. Adding even one level of nesting increases the potential complexity of a topic exponentially, with a corresponding decrease in the potential for reuse across document type and system boundaries. Allowing unlimited nesting destroys the entire idea of a topic in the DITA architecture - the unit of reuse becomes essentially unlimited in its complexity.
I would rather simply preserve the existing architecture; my compromise proposal is article as a peer of topic. You would rather allow unlimited nesting of sections in topic; your compromise proposal is to add one level of nesting. It sounds like neither of our compromises is acceptable to the other. Perhaps the only thing we agree on is that we disagree. I suspect we need input from others, and ultimately a decision from the TC.
Michael Priestley