
Subject: Re: [dita] Lightweight DITA proposal - 13076

From: Michael Priestley <mpriestl@ca.ibm.com>
To: dita <dita@lists.oasis-open.org>
Date: Fri, 23 Mar 2012 16:07:41 -0400

Some further thoughts after the discussion at the TC call...

Topic content outside section
The section-oriented model I provided below would work for content with a clear section-oriented structure, like most reference content, or tasks. But JoAnn pointed out that a lot of looser content models start with some initial explanation before the sections begin. It seems to me there are a few different ways we might accomplish this, so I wanted to call them out for discussion/comparison.

Option 1: use a non-titled section. There's no requirement that a section have a title, generated or not. At authoring time it could be given a title like "introduction" or "overview", to act as a prompt/rationale, even if the section title is omitted at publication time. Specializers that didn't want that initial introductory section could simply omit it. This option has the cleanest match to a section-oriented specialization assembly model.

Option 2: instead of having <shortdesc> in the model, replace it with <abstract>, with a required first child of <shortdesc>, followed by any number of p/ul/ol/etc. The one drawback is that someone who wanted to specialize to get only <shortdesc> would still need to retain the <abstract> element as a container.

Option 3: allow p/ul/ol/etc. before the first section - like <concept> does. At specialization time we'd need to provide a common option to turn it off/get rid of it.

Any thoughts/preferences?

The format attribute
Do we need the format attribute for references? For both topicref and xref?

I wonder if we could get by with some additional specialized elements - for example, <mapref> could set a defaulted format attribute to "ditamap" without exposing the attribute to authors.

Part of my concern with generally enabling a format attribute is the cost of authoring. It's something that has no easy parallel with HTML, so it will need some special handling in HTML-oriented editing scenarios.

One possibility would be to make it an optional attribute - like keyref. Content systems that don't need it could omit it, for example if they can work with the file extensions instead.

Missing elements
I realized that we're missing a large class of content by excluding the <pre> element - for example, any content that includes code snippets. So I think it's worth adding back in, especially given that it's allowed in HTML5, so it preserves our subset compatibility.

I also initially omitted <title> from <section>. I'm now thinking this is a decision that needs to be made at the specialization level. There are valid section cases that could still warrant a user-authored title, like an <example>.

Map specialization
I didn't really talk about map specializations at all. One option would be to add the @type attribute, purely so it can be defaulted by specialized topicrefs that require a certain specialization target.

For example, a specialized <meeting> map could require specialized topicrefs that point to <minutes> topics for past meetings or to <agenda> topics for future meetings:

<meetinglist>
<pastmeetings>
<minutesref type="minutestopic" href=""/>
</pastmeetings>
<futuremeetings>
<agendaref type="agendatopic" href=""/>
</futuremeetings>
</meetinglist>

The type attribute default value would not be human-editable, but could be used by link management systems to ensure that when someone wants to add an agenda topic to the map, they are presented only with DITA topics with an appropriately matching type.
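
As a rough illustration of how a link management system might use the defaulted type, here is a minimal Python sketch. The element names and type values come from the example above; the function name, the file names, and the idea of a precomputed path-to-type lookup are assumptions made only for illustration.

```python
# Defaulted @type per specialized topicref, as in the <meetinglist> example.
DEFAULT_TYPE = {
    "minutesref": "minutestopic",
    "agendaref": "agendatopic",
}


def linkable_topics(ref_name, topic_types):
    """Return the topic paths whose type matches the ref's defaulted @type.

    topic_types maps a topic path to its type value (a hypothetical index
    that a content system might maintain).
    """
    wanted = DEFAULT_TYPE[ref_name]
    return sorted(path for path, kind in topic_types.items() if kind == wanted)


# Hypothetical repository contents.
repository = {
    "2012-03-13-minutes.dita": "minutestopic",
    "2012-03-20-minutes.dita": "minutestopic",
    "2012-03-27-agenda.dita": "agendatopic",
}

# Someone adding an <agendaref> would be offered only agenda topics.
print(linkable_topics("agendaref", repository))
```

The author never sees or edits the type value; the picker simply narrows the choices.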

Michael Priestley, Senior Technical Staff Member (STSM)
Lead IBM DITA Architect
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25

From: Michael Priestley/Toronto/IBM@IBMCA
To: dita <dita@lists.oasis-open.org>
Date: 03/20/2012 09:36 AM
Subject: [dita] Lightweight DITA proposal - 13076
Sent by: <dita@lists.oasis-open.org>

Thanks to Kris Eberlein, Don Day, Chris Nitchie, and Robert Anderson for help whipping this initial proposal into shape.

The goal is to create a lightweight DITA framework that can ease adoption for casual, contributing, or non-technical authors. We want to make both editing and specialization easier for vendors to support by limiting options and choice points.

There is a related proposal 13051 for defining starter sets of constrained content models, where we can add more explicit specializations or examples of these constraints in use. For example, 13051 could include a starter set for the DITA concept/task/reference specializations.

Here are my thoughts for the core content models and specialization architecture. I've tried to simplify both the content models and attribute models to eliminate redundancy, and wherever possible support just one way of doing things. I've tried to simplify the specialization architecture to the point where a complete and valid specialization could be easily generated from a simple set of forms. Here they are:

Simplified topic content model
<topic> contains title, shortdesc, body
<shortdesc> contains text or ph
<body> contains section
<section> contains p, ul, ol, simpletable, image
<p> contains text, ph, xref
<ph> contains ph, text
<xref> contains text
<ul>/<ol> contain li
<li> contains p
<simpletable> contains 1 sthead (optional), strow
<sthead>/<strow> contain <stentry>
<stentry> contains p, ul, ol
<image> contains <alt>
<alt> contains text

Total: 16 elements
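
To make the containment rules concrete, here is a minimal Python sketch that transcribes the list above into a lookup table and checks a document against it. It checks only which elements may appear inside which. It does not check order, cardinality, or where text is allowed. The entry for title is borrowed from the map model later in this message ("text or ph"), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Allowed child elements per parent, transcribed from the model above.
ALLOWED_CHILDREN = {
    "topic": {"title", "shortdesc", "body"},
    "title": {"ph"},  # assumption: same as <title> in the map model
    "shortdesc": {"ph"},
    "body": {"section"},
    "section": {"p", "ul", "ol", "simpletable", "image"},
    "p": {"ph", "xref"},
    "ph": {"ph"},
    "xref": set(),
    "ul": {"li"},
    "ol": {"li"},
    "li": {"p"},
    "simpletable": {"sthead", "strow"},
    "sthead": {"stentry"},
    "strow": {"stentry"},
    "stentry": {"p", "ul", "ol"},
    "image": {"alt"},
    "alt": set(),
}


def find_violations(xml_text):
    """List every child element that its parent's model does not allow."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag not in ALLOWED_CHILDREN:
        problems.append(f"<{root.tag}> is not in the model")
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                problems.append(f"<{child.tag}> not allowed in <{parent.tag}>")
    return problems


sectioned = (
    "<topic><title>T</title><shortdesc>S</shortdesc>"
    "<body><section><p>Hi <ph>there</ph></p></section></body></topic>"
)
loose = (
    "<topic><title>T</title>"
    "<body><p>Intro</p><section><p>Details</p></section></body></topic>"
)

print(find_violations(sectioned))
print(find_violations(loose))
```

The second document puts a paragraph directly in the body, which this section-only model rejects.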

Simplified topic attribute model
All attributes are optional for specializers except @id on topic, @href on xref and image, and the defaulted @domains and @class attributes. Each attribute or set of attributes below can be included/excluded through predefined constraints.
Conditional content: @props on section, p, ul, ol, simpletable, strow, li, image
Content reuse: @id/@conref on section, p, ul, ol, simpletable, strow, li, image
Variable links: @keyref on xref
Variable content: @keyref on ph, image
Class extension: @outputclass on everything

A note on href: to simplify the linking model, the format and scope attributes are omitted, and their values derived from context. Format will be taken from the file extension (e.g. .dita, .html), and scope will be local unless the link starts with http://, in which case it becomes external.
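
The derivation rule in the note on href could be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, a trailing #fragment is stripped before the extension is read, and only the http:// prefix named above is treated as external (other prefixes can be passed in, but the proposal does not mention them).

```python
import posixpath


def derive_link_attributes(href, external_prefixes=("http://",)):
    """Derive (format, scope) from an href instead of authored attributes."""
    # Assumption: ignore any #fragment when looking for the file extension.
    target = href.split("#", 1)[0]
    extension = posixpath.splitext(target)[1]
    link_format = extension[1:].lower() if extension else None
    # Scope is local unless the link starts with an external prefix.
    scope = "external" if href.startswith(external_prefixes) else "local"
    return link_format, scope


print(derive_link_attributes("intro.dita"))                    # ('dita', 'local')
print(derive_link_attributes("http://example.com/page.html"))  # ('html', 'external')
print(derive_link_attributes("tasks/setup.dita#step2"))        # ('dita', 'local')
```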

Total, if all enabled: 6 editable attributes plus 2 defaulted ones

Simplified map content model
<map> contains title, topicref
<title> contains text or ph
<ph> contains ph or text
<topicref> contains topicmeta (or modify base topicref model to contain <title> as option before topicmeta)
<topicmeta> contains navtitle
<navtitle> contains text or ph

Total: 6 elements (5 if we add title to topicref base, so we can eliminate the topicmeta containment layer)

Simplified map attribute model
All attributes are optional for specializers, except the defaulted @domains and @class attributes. Each attribute or set of attributes below can be included/excluded through predefined constraints.
Referencing content: @href on topicref
Variable and metadata control: @keys on topicref
Conditional content: @props on map, topicref
Content reuse: @id/@conref on topicref
Variable text: @keyref on ph
Variable links: @keyref on topicref
Class extension: @outputclass on everything

Total, if all enabled: 6 editable attributes plus 2 defaulted ones

Some prepackaged map/attribute combos
TOC map: Referencing content (@href)
TOC/Variable link map: Referencing content, Variable/metadata control (@href, @keys)
Variable text/metadata control map: Variable/metadata control (@keys)

A final thought on the DTDs/XSDs: I think it could make sense to actually implement these as subset DTDs, rather than using redefinition of the content model entities in the existing DTDs (topic.mod etc.), so that the DTDs/XSDs themselves are easier to understand. This would give us at the TC a few more files to maintain, but considerably less complexity to manage.

---------------------------------------------------------
Now on to the specialization architecture:

Simplified specialization model
topic: can be specialized to give a semantic container name to the content. Only structural specialization allowed.
section: can create new section specializations as domains that can then be integrated into new topic types using constraints to impose order
ph: can create new ph specializations as domains
@props: can create new conditional processing attributes as domains

Total specializable elements/attributes: 4

So a new topic type would be constructed by:
1) Defining a new container name (topic element specialization)
2) Pulling in a particular set of section specializations in a particular order (like <goals>, <agenda>, <minutes>, <actionitems>)
3) Choosing the attribute and inline semantics to enable (@props and ph specializations)

A new section type domain would be constructed by:
1) Defining a new container name (section element specialization)
2) Defining its content model - I'd suggest either a single content element from p, ul, ol, simpletable, or allow everything
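
The two steps for a section type domain are small enough to generate mechanically. Here is a minimal Python sketch that emits DTD declarations for a new section type. The function name and the "example-d" domain name are hypothetical. Reading "a single content element" as a model like (ul) is an assumption, and so is the exact form of the defaulted class value, which is modelled on the way existing DITA domains declare it.

```python
# Block elements that <section> allows in the simplified topic model.
SECTION_CHILDREN = ("p", "ul", "ol", "simpletable", "image")
# Elements suggested above as the single content element of a section type.
SINGLE_CHOICES = ("p", "ul", "ol", "simpletable")


def section_type_dtd(name, content=None, domain="example-d"):
    """Return DTD declarations for a section specialization.

    content=None means "allow everything"; otherwise it names the one
    element the new section type may contain.
    """
    if content is None:
        model = "(" + " | ".join(SECTION_CHILDREN) + ")*"
    elif content in SINGLE_CHOICES:
        model = f"({content})"
    else:
        raise ValueError(f"unsupported content element: {content}")
    return "\n".join([
        f"<!ELEMENT {name} {model}>",
        # Defaulted class attribute recording the specialization ancestry.
        f'<!ATTLIST {name} class CDATA "+ topic/section {domain}/{name} ">',
    ])


print(section_type_dtd("agenda", "ul"))
print(section_type_dtd("minutes"))
```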

------------------------------------------------------

Finally, some thoughts on a specialization/authoring prototyping architecture. With the simplified rules above, it should be really simple to get someone up and running with authoring and publishing. We could even choose to define topic templates for defining new topic types and new domains, which people could fill in and then feed to a transform to get the actual DTDs/XSDs plus editor-specific and publishing-specific overrides.

Example:

<topictypedefinition id="mtgnotes">
<title>Meeting Notes</title>
<shortdesc>A topic type for taking notes on meetings</shortdesc>
<body>
<sections>

Section reference         Generated heading  Authoring prompt
<xref keyref="goals"/>    Goals              Describe the goals of the meeting
<xref keyref="agenda"/>   Agenda             List the agenda items
<xref keyref="minutes"/>  Minutes            Record discussion

</sections>
</body>
</topictypedefinition>

<sectiontypedefinition id="agenda">
<title>Agenda</title>
<shortdesc>A reusable section definition for listing agenda items</shortdesc>
<body>
<contentmodel>
<ul>...</ul>
</contentmodel>
</body>
</sectiontypedefinition>

Etc.

From the definition above, a tool could generate DTDs/XSDs, authoring templates, and output overrides. A simple form of authoring template could be just HTML5 output with the authoring prompts included, and title and section content set to contentEditable="true".
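
As a minimal sketch of the HTML5 authoring template idea, the following Python turns the meeting notes definition into an editable page. The dictionary is a hand-written stand-in for the parsed <topictypedefinition>. The function name, the HTML class names, and the choice of a div for the editable section content are all assumptions, not part of the proposal.

```python
from html import escape

# Stand-in for the parsed definition: (section id, heading, prompt).
MEETING_NOTES = {
    "id": "mtgnotes",
    "title": "Meeting Notes",
    "sections": [
        ("goals", "Goals", "Describe the goals of the meeting"),
        ("agenda", "Agenda", "List the agenda items"),
        ("minutes", "Minutes", "Record discussion"),
    ],
}


def authoring_template(definition):
    """Build an HTML5 authoring template with prompts and editable regions."""
    lines = [
        "<!DOCTYPE html>",
        f"<title>{escape(definition['title'])}</title>",
        f'<article class="{escape(definition["id"])}">',
        # The topic title is authored, so it is editable.
        '  <h1 contenteditable="true"></h1>',
    ]
    for section_id, heading, prompt in definition["sections"]:
        lines += [
            f'  <section class="{escape(section_id)}">',
            # The heading is generated, so it is not editable.
            f"    <h2>{escape(heading)}</h2>",
            f'    <p class="prompt">{escape(prompt)}</p>',
            '    <div contenteditable="true"></div>',
            "  </section>",
        ]
    lines.append("</article>")
    return "\n".join(lines)


print(authoring_template(MEETING_NOTES))
```

A transform back from the edited HTML to DITA would still be needed; this only covers the template side.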

Another advantage of having a higher-level definition of the specialization might be that we could generate both a subset-DTD version (easy to understand) and a full-DTD-compatible one (easy to integrate with full DITA), in case there is someone who needs to pull specialized elements from both architectures into a single DTD.

Let me know what you think.

Michael Priestley, Senior Technical Staff Member (STSM)
Lead IBM DITA Architect
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25

