dita message



Subject: DITA 1.2 - definition and use of controlled values


Hi, Forward-Moving TC:

We'd like to bring forward a proposal that DITA 1.2 introduce a method for defining and exchanging controlled values (such as a list of operating systems for the platform attribute). Such a method would allow adopters to share vocabulary designs while applying controlled values appropriate to their content. It would also make exchanged content easier to process, because adopters could share the definitions of their controlled values along with the content.

Once a DITA Specification 1.2 Requirements category exists, I'll add the proposal there. For now, I've attached the proposal to this note.


Hoping that's useful,


Erik Hennum
ehennum@us.ibm.com


Source: (See attached file: IssueControlledValues.dita)

Formatted: (See attached file: IssueControlledValues.html)

IssueControlledValues.dita

Title: DITA Proposed Feature - controlled values


Manage and validate enumerations for attributes as controlled content rather than as part of the DTD or XSD document design.

Longer description

The problem: The list of values for an attribute is determined by the subject matter of the content. For instance, content for a software design tool will need Architect and Developer values for the audience attribute. By contrast, content for pharmaceutical trial reporting might need Researcher and Executive values for the same attribute. In short, content creators need to be able to extend value enumerations and share value enumerations with their content.

In traditional approaches, the DTD or XSD defines controlled values as an enumerated list for an attribute. This approach, however, is undesirable for several reasons:

  • The DTD or XSD tends to be applicable to many subject areas. Defining an enumeration in the DTD or XSD limits the potential use of a document type that would otherwise have broad applicability.

    In particular, adding a new value introduces versioning and incompatibility issues for the document type even if the structure and semantics of the elements remain constant.

  • Few content creators have the necessary expertise to modify DTDs or XSDs.
  • The values must express a flat list and thus cannot express subsumption of values (such as Application Programmer and System Programmer roles within a Programmer role or variants on the Linux operating system).

Other standards have noted this problem. For instance, UBL (an OASIS standard for transactional business data) formally separates the validation of controlled values (using Schematron) from the validation of the document types (using XML Schema).

The solution: Provide DITA adopters with a method for defining controlled values as a stable, highly controlled portion of their content.

DITA provides maps for defining collections. Thus, a specialized map offers a natural DITA idiom for defining a collection of controlled values. A quick summary of the solution:

  • Define a controlled value as a key in a specialized DITA map.
  • Use the specialized map to organize controlled values in a list or hierarchy for an attribute.
  • In a content map or topic, use the controlled value in the attribute to apply the controlled value to the content.
  • Where useful, also use the specialized map to define controlled values in prose topics.
  • Where useful, use the specialized map to express relationships between controlled values.

Scope

Major

Use Case

The potential applications for controlled values range from filtering and flagging via selection attributes to subject classification. By providing a single method for defining controlled values for the entire range of applications, DITA can simplify understanding, simplify implementation, and scale seamlessly from lightweight uses of controlled values to sophisticated practices.

Technical Requirements

Defining an enumeration: Fundamentally, a controlled value is a short, readable, and meaningful identifier for a subject. Such identifiers are a good match for the DITA 1.2 proposal for keys. That is, the minimum definition of a controlled value could consist of the definition for a key as part of the enumeration for a category. The following example defines a flat list of controlled values within the operating system category, introducing a <subjectdef> element (specialized from <topicref>) to distinguish the identified thing as a subject rather than a topic content object:

<subjectScheme>
  <subjectdef keys="os">
    <subjectdef keys="linux"/>
    <subjectdef keys="mswin"/>
    <subjectdef keys="zos"/>
  </subjectdef>
  ...
</subjectScheme>

For clarity and maintainability, a content provider can supply a navtitle attribute for each value. Tools can display the title to users while using the key for tagging content. The title can change without invalidating existing tagging.

<subjectScheme>
  <subjectdef keys="os" navtitle="Operating system">
    <subjectdef keys="linux" navtitle="Linux"/>
    <subjectdef keys="mswin" navtitle="Windows"/>
    <subjectdef keys="zos"   navtitle="z/OS"/>
  </subjectdef>
  ...
</subjectScheme>

The enumeration can be defined with hierarchical levels. The following example defines RedHat and SuSE as special kinds of Linux. Tools such as filtering and flagging processes can match content tagged with child values when a parent value is specified.

<subjectScheme>
  <subjectdef keys="os" navtitle="Operating system">
    <subjectdef keys="linux" navtitle="Linux">
      <subjectdef keys="redhat" navtitle="RedHat Linux"/>
      <subjectdef keys="suse"   navtitle="SuSE Linux"/>
    </subjectdef>
    <subjectdef keys="mswin" navtitle="Windows"/>
    <subjectdef keys="zos"   navtitle="z/OS"/>
  </subjectdef>
  ...
</subjectScheme>
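
The subsumption rule described above can be sketched in Python. This is only an illustration under stated assumptions: the proposal does not define a processing API, the helper names `build_parent_index` and `matches` are hypothetical, and the scheme is inlined as a string with the placeholder "..." lines left out and the `linux` subject written as a container for its two children.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline copy of the hierarchical scheme ("..." placeholders omitted).
SCHEME = """
<subjectScheme>
  <subjectdef keys="os" navtitle="Operating system">
    <subjectdef keys="linux" navtitle="Linux">
      <subjectdef keys="redhat" navtitle="RedHat Linux"/>
      <subjectdef keys="suse"   navtitle="SuSE Linux"/>
    </subjectdef>
    <subjectdef keys="mswin" navtitle="Windows"/>
    <subjectdef keys="zos"   navtitle="z/OS"/>
  </subjectdef>
</subjectScheme>
"""

def build_parent_index(scheme_xml):
    """Map each key to the key of its parent subject (None for top-level subjects)."""
    parents = {}

    def walk(element, parent_key):
        for child in element.findall("subjectdef"):
            key = child.get("keys")
            parents[key] = parent_key
            walk(child, key)

    walk(ET.fromstring(scheme_xml), None)
    return parents

def matches(tagged_value, selected_value, parents):
    """Return True if tagged_value is selected_value or one of its descendants."""
    current = tagged_value
    while current is not None:
        if current == selected_value:
            return True
        current = parents.get(current)
    return False

PARENTS = build_parent_index(SCHEME)
```

With this index, `matches("redhat", "linux", PARENTS)` is true, so a filter that selects Linux would also pick up content tagged as RedHat, while `matches("linux", "redhat", PARENTS)` is false because the match only runs from child to parent.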

As with <topicref>, properties of a <subjectdef> defined in different maps should aggregate. This principle applies to relationships as well. Thus, an existing enumeration can be extended out of line by a different map that attaches new values as children of an existing value. The extension can identify the parent value by key. For instance, a different map can add a Macintosh subject as a top-level value in the operating system category and add child subjects under the Windows subject.

<subjectScheme>
  <schemeref href="base_os.ditamap"/>
  <subjectdef keyref="os">
    <subjectdef keys="macos" navtitle="Macintosh"/>
    <subjectdef keyref="mswin">
      <subjectdef keys="win98" navtitle="Windows 98"/>
      <subjectdef keys="winxp" navtitle="Windows XP"/>
    </subjectdef>
  </subjectdef>
  ...
</subjectScheme>
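
One possible reading of this aggregation, sketched in Python. The function name `aggregate_children` is hypothetical, the base map content is assumed to match the flat operating system list shown earlier, and the `<schemeref>` is not resolved here: both schemes are simply passed in as strings.

```python
import xml.etree.ElementTree as ET

# Assumed content of base_os.ditamap ("..." placeholders omitted).
BASE = """<subjectScheme>
  <subjectdef keys="os" navtitle="Operating system">
    <subjectdef keys="linux" navtitle="Linux"/>
    <subjectdef keys="mswin" navtitle="Windows"/>
    <subjectdef keys="zos" navtitle="z/OS"/>
  </subjectdef>
</subjectScheme>"""

EXTENSION = """<subjectScheme>
  <schemeref href="base_os.ditamap"/>
  <subjectdef keyref="os">
    <subjectdef keys="macos" navtitle="Macintosh"/>
    <subjectdef keyref="mswin">
      <subjectdef keys="win98" navtitle="Windows 98"/>
      <subjectdef keys="winxp" navtitle="Windows XP"/>
    </subjectdef>
  </subjectdef>
</subjectScheme>"""

def aggregate_children(*scheme_sources):
    """Collect the child keys of every subject across all supplied schemes."""
    children = {}

    def walk(element, parent_key):
        for child in element.findall("subjectdef"):
            # A definition carries keys; an out-of-line extension points at it with keyref.
            key = child.get("keys") or child.get("keyref")
            children.setdefault(key, [])
            if parent_key is not None and key not in children[parent_key]:
                children[parent_key].append(key)
            walk(child, key)

    for source in scheme_sources:
        walk(ET.fromstring(source), None)
    return children

MERGED = aggregate_children(BASE, EXTENSION)
```

After merging, `MERGED["os"]` lists `linux`, `mswin`, `zos`, and `macos`, and `MERGED["mswin"]` lists `win98` and `winxp`, even though neither map defines the complete hierarchy by itself.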

A category can be extended upward. For instance, a content provider might create a Software category that includes operating systems.

<subjectScheme>
  <schemeref href="base_os.ditamap"/>
  <subjectdef keys="sw" navtitle="Software">
    <subjectdef keyref="os"/>
    <subjectdef keys="app" navtitle="Applications">
      <subjectdef keys="apacheserv" navtitle="Apache Web Server"/>
      <subjectdef keys="mysql" navtitle="MySQL Database"/>
    </subjectdef>
  </subjectdef>
  ...
</subjectScheme>

When sharing controlled values, content teams must apply the same interpretation to each value. Otherwise, the value will associate dissimilar content. For instance, if one content team regards UNIX as including Linux while another regards Linux and UNIX as mutually exclusive, a definition of the meaning of their values will help the two teams discover and accommodate the discrepancy. (The second team will need to define a new parent subject for their existing Linux and UNIX subjects and equate that parent subject with the other team's UNIX subject.)

To define a controlled value, a content team can supply a definitional topic (similar to an entry in an encyclopaedia or glossary) at any time:

<subjectScheme>
  <subjectdef keys="os" navtitle="Operating system">
    <subjectdef keys="linux" navtitle="Linux" href="subject/linux.dita"/>
    <subjectdef keys="mswin" navtitle="Windows"/>
    <subjectdef keys="unix"  navtitle="UNIX"  href="subject/unix.dita"/>
    <subjectdef keys="zos"   navtitle="z/OS"/>
  </subjectdef>
  ...
</subjectScheme>

<concept id="linux">
  <title>The Linux operating system</title>
  <body>
     <p>Although Linux has historical roots in UNIX, ...</p>
  </body>
</concept>

<concept id="unix">
  <title>The UNIX operating system</title>
  <body>
     <p>As a commercial operating system, UNIX differs from Linux ...</p>
  </body>
</concept>

In fact, when a subject is defined with a key but not an href, the key can be thought of as an identifier for a virtual definitional topic that isn't needed for a well-known subject but could be provided later.

By organizing controlled values in a subsuming hierarchy and defining each controlled value precisely in a topic, a content provider is in fact creating a formal taxonomy. The specialized map can define more precise hierarchical or associative relationships between subjects.

<subjectScheme>
  <subjectdef keys="mswin" navtitle="Windows">
    <hasPart>
      <subjectdef keys="iexplorer" navtitle="Internet Explorer Browser"/>
      ...
    </hasPart>
    ...
  </subjectdef>
  <relatedSubjects>
    <subjectdef keys="linux" navtitle="Linux"/>
    <subjectdef keys="mysql" navtitle="MySQL Database"/>
    ...
  </relatedSubjects>
  ...
</subjectScheme>

While available, such formality isn't required.

The existing DITA taxonomy specialization (available as a plugin for the DITA Open Toolkit) provides a precedent for defining subjects in this way.

Binding a value category to an attribute: The specialized map can specify that an attribute's values should be limited to one or more categories. The following example uses specialized elements to associate the platform attribute with the operating system category:

<valuedef type="keys">
  <enumeration type="attribute" name="platform"/>
  <subjectdef keyref="os"/>
</valuedef>

The specialized map defining the controlled values and their binding to attributes can be registered with tools or processes using tool-specific mechanisms (for instance, using catalogs).

Tagging content with values: After controlled values have been bound to an attribute and registered with tools, tools can validate the attribute. For instance, an editor could prevent the user from entering "linix" as a platform value or provide a pick list offering the titles of the Operating System subjects for selection by the user.

<note platform="linux">Don't remove the root directory.</note>
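
As a rough sketch of the validation an editor might perform, the following Python reads the binding and checks attribute values against it. Several things here are assumptions rather than part of the proposal: the `<valuedef>` is placed inside the `<subjectScheme>`, the category key itself (`os`) is not treated as a permitted value, an attribute without a binding is left unconstrained, and the names `allowed_values` and `is_valid` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed layout: the binding sits in the same scheme as the values it refers to.
SCHEME = """<subjectScheme>
  <subjectdef keys="os" navtitle="Operating system">
    <subjectdef keys="linux" navtitle="Linux"/>
    <subjectdef keys="mswin" navtitle="Windows"/>
    <subjectdef keys="zos" navtitle="z/OS"/>
  </subjectdef>
  <valuedef type="keys">
    <enumeration type="attribute" name="platform"/>
    <subjectdef keyref="os"/>
  </valuedef>
</subjectScheme>"""

def allowed_values(scheme_xml):
    """Return {attribute name: {key: title}} for each attribute binding."""
    root = ET.fromstring(scheme_xml)
    # Index every defined subject by its key.
    defined = {el.get("keys"): el for el in root.iter("subjectdef") if el.get("keys")}
    bindings = {}
    for valuedef in root.findall("valuedef"):
        attribute = valuedef.find("enumeration").get("name")
        values = bindings.setdefault(attribute, {})
        for ref in valuedef.findall("subjectdef"):
            category = defined[ref.get("keyref")]
            # Every keyed descendant of the category is a permitted value.
            for el in category.iter("subjectdef"):
                if el is not category and el.get("keys"):
                    values[el.get("keys")] = el.get("navtitle", el.get("keys"))
    return bindings

def is_valid(attribute, value, bindings):
    """Check each space-separated token of an attribute value against its binding."""
    permitted = bindings.get(attribute)
    if permitted is None:
        return True  # Assumption: an unbound attribute is unconstrained.
    return all(token in permitted for token in value.split())

BINDINGS = allowed_values(SCHEME)
```

Here `is_valid("platform", "linux", BINDINGS)` passes and `is_valid("platform", "linix", BINDINGS)` fails. The titles in `BINDINGS["platform"]` (Linux, Windows, z/OS) are what a pick list could display while the keys are written into the attribute.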

Some content providers won't want to define new attributes for categories of controlled values. Such content providers can indicate the applicability of the subject for filtering and flagging a topic with a specialized <topicapply> element.

<map>
  ...
  <topicref href="troubleshootingLamp.dita">
    <topicapply keyref="linux"/>
    ...
  </topicref>
  ...
</map>

Multiple values can be listed within the <topicapply> element:

<map>
  ...
  <topicref href="troubleshootingLamp.dita">
    <topicapply>
      <subjectref keyref="linux"/>
      <subjectref keyref="apacheserv"/>
      <subjectref keyref="mysql"/>
      <subjectref keyref="perl"/>
    </topicapply>
    ...
  </topicref>
  ...
</map>
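
A filtering process could gather these values per topic along the following lines. This Python sketch is an assumption about how such tooling might work, not a defined behavior: the names `applied_subjects` and `topics_for` are hypothetical, and the second topic reference (`installWindows.dita`) is invented for illustration to show the single-value form alongside the multi-value form.

```python
import xml.etree.ElementTree as ET

# Sample content map; installWindows.dita is a hypothetical topic.
CONTENT_MAP = """<map>
  <topicref href="troubleshootingLamp.dita">
    <topicapply>
      <subjectref keyref="linux"/>
      <subjectref keyref="apacheserv"/>
      <subjectref keyref="mysql"/>
      <subjectref keyref="perl"/>
    </topicapply>
  </topicref>
  <topicref href="installWindows.dita">
    <topicapply keyref="mswin"/>
  </topicref>
</map>"""

def applied_subjects(map_xml):
    """Return {topic href: [subject keys]} gathered from <topicapply> elements."""
    applied = {}
    for topicref in ET.fromstring(map_xml).iter("topicref"):
        keys = applied.setdefault(topicref.get("href"), [])
        for apply in topicref.findall("topicapply"):
            # Single-value form: the key sits on <topicapply> itself.
            if apply.get("keyref"):
                keys.append(apply.get("keyref"))
            # Multi-value form: one <subjectref> per value.
            keys.extend(ref.get("keyref") for ref in apply.findall("subjectref"))
    return applied

def topics_for(subject_key, applied):
    """List the topics to which the given subject applies."""
    return [href for href, keys in applied.items() if subject_key in keys]

APPLIED = applied_subjects(CONTENT_MAP)
```

In this sample, `topics_for("mysql", APPLIED)` returns only `troubleshootingLamp.dita`, and `topics_for("mswin", APPLIED)` returns only `installWindows.dita`.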

Where a controlled value has a definitional topic, a reference to the definitional topic can be used instead of the key. (Anything else would be inconsistent with the general behavior of keys.)

<map>
  ...
  <topicref href="troubleshootingLamp.dita">
    <topicapply href="subject/linux.dita"/>
    ...
  </topicref>
  ...
</map>

Finally, to distinguish content that is truly about a subject and thus appropriate for retrieval (as opposed to merely applicable to a subject and thus appropriate for filtering and flagging), the content provider can use a specialized <topicsubject> element:

<map>
  ...
  <topicref href="linuxCapabilities.dita">
    <topicsubject keyref="linux"/>
    ...
  </topicref>
  ...
</map>

Note: For consistency, references to definitional topics might be accepted as synonyms for key values in the DITA values file.

Costs

  • Implementing the specialized map for the subject definition scheme.
  • Adding the <topicapply> and <topicsubject> elements to content maps.
  • Enhancing tooling to validate controlled values against subject definition schemes.
  • Bridging existing filtering and flagging tooling to process the <topicapply> element.

Benefits

Adopters can easily define and exchange controlled values ranging from simple flat lists to sophisticated taxonomies.

Time Required

A couple of days for the map implementations.


