
Subject: Fw: Updated Glossary / Acronym proposal



Here is the latest version of the Glossary/Acronym proposal, based on
discussion at today's Translation Subcommittee meeting.

Thanks,

Robert D Anderson
IBM Authoring Tools Development
Chief Architect, DITA Open Toolkit
(507) 253-8787, T/L 553-8787 (Good Monday & Thursday)
----- Forwarded by Robert D Anderson/Rochester/IBM on 01/28/2008 01:08 PM
-----
                                                                           
From: Robert D Anderson/Rochester/IBM
Date: 01/28/2008 01:07 PM
To: <dita-translation@lists.oasis-open.org>
cc: "Rodolfo M. Raya" <rmraya@maxprograms.com>, "Bruce Esrig" <bruce.esrig@gmail.com>, Kara Warburton/Toronto/IBM@IBMCA
Subject: Updated Glossary / Acronym proposal



Hi,

This version incorporates three sets of changes, all rather minor:
* Erik's updates based on the Translation Subcommittee meeting and
following TC meeting last week
* JoAnn's modifications after those edits
* My modifications based on today's Translation Subcommittee meeting

Updates based on today's meeting:
* Moved the Spanish example of WMD so that it appears after the text
explanation
* Added a second example of the Spanish WMD translation
* Added a note about keyref resolution (clarify what qualifies as first
use)
* Added a note to clarify that the keyref value does not have to match the
acronym

(See attached file: IssueGlossary12026.dita)(See attached file:
IssueGlossary12026.html)

Thanks,

Robert D Anderson
IBM Authoring Tools Development
Chief Architect, DITA Open Toolkit
(507) 253-8787, T/L 553-8787 (Good Monday & Thursday)

IssueGlossary12026.dita

Title: DITA Proposed Feature #12026 and #12038

Build on the DITA 1.1 glossary specialization for more complete support of glossary, linguistic, and semantic applications and also to assist in the resolution and handling of abbreviated-form text such as acronyms, general abbreviations, and short forms in source and target text within DITA documents.

Comparison of original and revised acronym markup

This section contrasts the original and revised markup for the acronym example terms.

Table 1. Weapons of Mass Destruction
Reference

Original markup:
<abbreviated-form conref="acronyms.dita#acronyms/wmd"/>

Revised markup:
<abbreviated-form keyref="wmd"/>

English source

Original markup:
<abbreviated-form id="wmd">
  <expanded>Weapons of Mass Destruction</expanded>
  <short>WMD</short>
  <surface-form>Weapons of Mass Destruction (WMD)</surface-form>
</abbreviated-form>

Revised markup:
<glossentry id="wmd">
  <glossterm>Weapons of Mass Destruction</glossterm>
  <glossBody>
    <glossSurfaceForm>Weapons of Mass Destruction (WMD)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>WMD</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>

Spanish translation

Original markup:
<abbreviated-form id="wmd">
  <expanded>armas de destrucción masiva</expanded>
  <short>armas de destrucción masiva</short>
  <surface-form>armas de destrucción masiva</surface-form>
</abbreviated-form>

Revised markup:
<glossentry id="wmd">
  <glossterm>armas de destrucción masiva</glossterm>
  <glossBody>
    <glossSurfaceForm>armas de destrucción masiva</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>armas de destrucción masiva</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>

Table 2. AIDS
Reference

Original proposal:
<abbreviated-form conref="acronyms.dita#acronyms/aids"/>

Revised markup:
<abbreviated-form keyref="aids"/>

English source

Original proposal:
<abbreviated-form id="aids">
  <expanded>acquired immunodeficiency syndrome</expanded>
  <short>AIDS</short>
  <surface-form>acquired immunodeficiency syndrome (AIDS)</surface-form>
</abbreviated-form>

Revised markup:
<glossentry id="aids">
  <glossterm>acquired immunodeficiency syndrome</glossterm>
  <glossBody>
    <glossSurfaceForm>acquired immunodeficiency syndrome (AIDS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>AIDS</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>

Spanish translation

Original proposal:
<abbreviated-form id="aids">
  <expanded>síndrome de inmuno-deficiencia adquirida</expanded>
  <short>SIDA</short>
  <surface-form>síndrome de inmuno-deficiencia adquirida (SIDA)</surface-form>
</abbreviated-form>

Revised markup:
<glossentry id="aids">
  <glossterm>síndrome de inmuno-deficiencia adquirida</glossterm>
  <glossBody>
    <glossSurfaceForm>síndrome de inmuno-deficiencia adquirida (SIDA)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>SIDA</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>

Single sourcing for terminology applications

This section gives examples of how subsets of the glossentry markup can be used for different applications making use of terms.

Usage for acronym resolution

An adopter interested only in term resolution for acronyms can declare an acronym with a glossentry topic similar to the following example:

<glossentry id="abs">
  <glossterm>Anti-lock Braking System</glossterm>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>

The adopter can declare a key for the acronym using the standard DITA 1.2 keyref mechanism:

<map>
  ...
  <topicref href="maintcar.dita"/>
  ...
  <glossref keys="abs" href="antiLockBrake.dita"/>
  ... key declarations for other referenced acronyms ...
</map>

The adopter can then refer to the acronym using the standard DITA 1.2 keyref mechanism:

<task id="maintcar">
  ...
    <info>The <abbreviated-form keyref="abs"/> will prevent the car from skidding ...</info>
  ...
</task>

Processes should resolve the "abs" reference to the <glossSurfaceForm> text in introductory contexts and to the <glossAcronym> text in other contexts.

Note that the keyref value does not need to match the acronym. In fact, using a more qualified value for the keyref will reduce conflicts in situations where the same acronym may resolve in many ways. For example, an information set could use “cars.abs” as the key for Anti-lock Braking System, and “ship.abs” to refer to the American Bureau of Shipping.
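
As a rough illustration of qualified keys, the following Python sketch builds a key table from a map fragment. The function name build_key_table, the bureauOfShipping.dita file name, and the "first declaration wins" rule are assumptions made for this sketch; they are not part of the proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical map fragment; the second href is invented for illustration.
MAP_XML = """
<map>
  <glossref keys="cars.abs" href="antiLockBrake.dita"/>
  <glossref keys="ship.abs" href="bureauOfShipping.dita"/>
</map>
"""

def build_key_table(map_xml):
    """Return a dict that maps each declared key to the href of its topic."""
    table = {}
    for ref in ET.fromstring(map_xml).iter("glossref"):
        # Assumption: a keys attribute may list several space-separated keys.
        for key in ref.get("keys", "").split():
            # Assumption: the first declaration of a key takes precedence.
            table.setdefault(key, ref.get("href"))
    return table

KEYS = build_key_table(MAP_XML)
print(KEYS["cars.abs"])
print(KEYS["ship.abs"])
```

Because the two keys are qualified, both senses of "ABS" can be declared in the same map without colliding.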

Usage for glossary publishing

An adopter interested only in traditional glossary publishing can explain one sense of a term with a glossentry topic similar to the following example:

<glossentry id="abs">
  <glossterm>Anti-lock Braking System</glossterm>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>

The adopter can then pull together a subset of the defined terms for a deliverable as in the following example:

<map>
  ...
  <topichead navtitle="glossary">
    <topicref href="antiLockBrake.dita"/>
    ... other terms in the glossary for this deliverable ...
  </topichead>
</map>

To produce a traditional glossary, a process should sort the terms included in a deliverable and list the explained senses under each term.
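
The sorting and grouping step can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the two "Brake" entries are invented to show a term with more than one sense, build_glossary is a hypothetical name, and the case-insensitive sort stands in for whatever language-aware collation a real process would use.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical glossentry topics; the two "Brake" entries are invented.
TOPICS = [
    '<glossentry id="brake1"><glossterm>Brake</glossterm>'
    '<glossdef>A device that slows a vehicle.</glossdef></glossentry>',
    '<glossentry id="abs"><glossterm>Anti-lock Braking System</glossterm>'
    '<glossdef>A brake technology that minimizes skids.</glossdef></glossentry>',
    '<glossentry id="brake2"><glossterm>Brake</glossterm>'
    '<glossdef>To slow a vehicle.</glossdef></glossentry>',
]

def build_glossary(topics):
    """Group definitions by term and return (term, senses) pairs in sorted order."""
    senses = defaultdict(list)
    for xml_text in topics:
        entry = ET.fromstring(xml_text)
        # Normalize whitespace so wrapped source lines compare equal.
        term = " ".join((entry.findtext("glossterm") or "").split())
        definition = " ".join((entry.findtext("glossdef") or "").split())
        senses[term].append(definition)
    # Assumption: a simple case-insensitive sort is good enough for the sketch.
    return [(term, senses[term]) for term in sorted(senses, key=str.casefold)]

GLOSSARY = build_glossary(TOPICS)
for term, definitions in GLOSSARY:
    print(term)
    for number, definition in enumerate(definitions, start=1):
        print(f"  {number}. {definition}")
```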

Usage for both acronym resolution and glossary publishing

Adopters need not declare the same acronym in different ways for different purposes but instead can establish a declaration of acronym terms for multiple purposes. An adopter who needs both to refer to an acronym and list the acronym in a published glossary would provide an explanation of the acronym as in the following example:

<glossentry id="abs">
  <glossterm>Anti-lock Braking System</glossterm>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>

The glossary can include the expanded acronym (as shown in the following example) as well as glossary terms that are not acronyms. In addition, the team can declare acronyms that are referenced but not included in the glossary:

<map>
  ...
  <topicref href="maintcar.dita"/>
  ...
  <topichead navtitle="glossary">
    <topicref keys="abs" href="antiLockBrake.dita"/>
    ... other referenced terms in the glossary ...
  </topichead>
  ... key declarations for other referenced acronyms that aren't in the glossary ...
</map>

The adopter can still refer to the acronym with the <abbreviated-form> element as in the following example:

<task id="maintcar">
  ...
    <info>The <abbreviated-form keyref="abs"/> will prevent the car from skidding ...</info>
  ...
</task>

Processing for term resolution to either the <glossSurfaceForm> or <glossAcronym> text and processing for glossary publishing work as before.

Usage for populating a terminology database

While a number of text analysis tools exist, the challenge for adopters is populating the terminology database that enables use of such tools. Published glossaries provide a practical source for terminology to populate such terminology databases.

An adopter who needs to populate a terminology database, in addition to acronym resolution and glossary publishing, can create glossentry topics similar to the following:

<glossentry id="abs">
  <glossterm>Anti-lock Braking System</glossterm>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
      <glossStatus value="preferred"/>
      <glossUsage>Recommended because more readers are familiar with the acronym than the term.</glossUsage>
    </glossAlt>
    <glossAlt>
      <glossSynonym>Anti-skid Brakes</glossSynonym>
      <glossStatus value="restricted"/>
      <glossUsage>Allowed in legacy content but not in new content.</glossUsage>
    </glossAlt>
  </glossBody>
</glossentry>

As illustrated by these examples, adopters can scale up to more sophisticated applications as their requirements change by taking advantage of optional elements that provide additional detail about the term.
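
To make the terminology database case more concrete, the following Python sketch flattens a glossentry topic into one row per term form. The row layout and the term_records name are assumptions for illustration only; the proposal does not define a database schema, and no default values are applied here.

```python
import xml.etree.ElementTree as ET

ENTRY_XML = """
<glossentry id="abs">
  <glossterm>Anti-lock Braking System</glossterm>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
      <glossStatus value="preferred"/>
      <glossUsage>Recommended because more readers are familiar with the acronym than the term.</glossUsage>
    </glossAlt>
    <glossAlt>
      <glossSynonym>Anti-skid Brakes</glossSynonym>
      <glossStatus value="restricted"/>
      <glossUsage>Allowed in legacy content but not in new content.</glossUsage>
    </glossAlt>
  </glossBody>
</glossentry>
"""

ALT_TERM_TAGS = ("glossAbbreviation", "glossAcronym", "glossShortForm", "glossSynonym")

def _value(parent, tag):
    """Return the value attribute of a direct child element, or None if absent."""
    element = parent.find(tag) if parent is not None else None
    return element.get("value") if element is not None else None

def term_records(entry_xml):
    """Return one dict per term form (hypothetical row layout)."""
    entry = ET.fromstring(entry_xml)
    body = entry.find("glossBody")
    common = {"subject": entry.get("id"), "pos": _value(body, "glossPartOfSpeech")}
    rows = [dict(common, role="glossterm", term=entry.findtext("glossterm"),
                 status=_value(body, "glossStatus"),
                 usage=body.findtext("glossUsage") if body is not None else None)]
    for alt in (body.findall("glossAlt") if body is not None else []):
        for child in alt:
            if child.tag in ALT_TERM_TAGS:
                rows.append(dict(common, role=child.tag, term=child.text,
                                 status=_value(alt, "glossStatus"),
                                 usage=alt.findtext("glossUsage")))
    return rows

ROWS = term_records(ENTRY_XML)
for row in ROWS:
    print(row["role"], "|", row["term"], "|", row["status"])
```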

Acronym terms

This section discusses the subset of the glossentry vocabulary specific to acronyms.

Reference for acronym terms

To use the glossentry topic for acronym resolution, the writer takes advantage of the following elements:

Base element: <concept>
Specialized element: <glossentry>
Content:
  1. one <glossterm>
  2. one <glossBody>
Purpose: Declares a term, its acronym, and its surface form.

Base element: <title>
Specialized element: <glossterm>
Content: title content for <glossterm> for consistency with DITA 1.1
Purpose: Specifies a term that also has an acronym form.

Base element: <conbody>
Specialized element: <glossBody>
Content:
  1. one <glossSurfaceForm>
  2. one <glossAlt> containing an acronym
Purpose: Contains detail about the term.

Base element: <p>
Specialized element: <glossSurfaceForm>
Content: text, <term>, <keyword>, or <tm> content
Purpose: Specifies an unambiguous presentation of an acronym such as providing the term with the acronym in parentheses. The surface form is suitable to introduce the term in new contexts.

Base element: <section>
Specialized element: <glossAlt>
Content:
  1. one <glossAcronym>
Purpose: Identifies an alternate term, in this case, an acronym.

Base element: <title>
Specialized element: <glossAcronym>
Content: text, <term>, <keyword>, or <tm> content
Purpose: Identifies the acronym for the term.

The <glossentry> topic provides additional subelements that are optional but available to scale up for single sourcing for additional purposes such as glossary publishing of the acronym (see Technical Requirements below).

Two new domains complement the glossary entry topic to make it easy to refer to acronyms (as shown in the example of acronym resolution):

  • A map domain specializes <topicref> to provide a <glossref> element to define a key for an acronym.
  • A topic domain specializes <term> to provide an <abbreviated-form> element to insert the acronym into the content of another topic.

Rendition of Abbreviated Forms

When the writer provides a keyref to a glossentry topic that contains a <glossSurfaceForm> element, a process should emit the surface form in introductory contexts where the term might be unfamiliar to the reader or in other contexts where a precise term is appropriate.

For instance, a process composing a book deliverable should emit the surface form on the first reference to the glossentry topic within the book or for every reference within a copyright or a warranty-related warning. A process generating an online page should emit the surface form as a hover tooltip on every instance of the term. A glossary publishing process should emit the surface form for the term.

When the writer uses the <abbreviated-form> element to refer to a glossentry topic, processing resolves the term reference to the text of the <glossSurfaceForm> element in introductory contexts and to text of the <glossAcronym> element in other contexts.

For instance, if the topic with the keyref to the "abs" key provided the first appearance of the ABS term within a book, the sentence could be rendered as follows:

"The Anti-lock Braking System (ABS) will prevent the car from skidding in adverse weather conditions."

If the ABS term had appeared previously within the book, the same sentence could instead be rendered as follows:

"The ABS will prevent the car from skidding in adverse weather conditions."
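
The first-use behavior described above can be sketched in Python as follows. This is only one possible reading of "introductory context": the TermResolver class is hypothetical, it treats the first reference to a key within a deliverable as introductory, and it assumes both the surface form and the acronym are present.

```python
import xml.etree.ElementTree as ET

GLOSSARY = {
    "abs": """
<glossentry id="abs">
  <glossterm>Anti-lock Braking System</glossterm>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>
""",
}

class TermResolver:
    """Hypothetical resolver that tracks which keys a deliverable has already used."""

    def __init__(self, entries):
        self.entries = {key: ET.fromstring(xml) for key, xml in entries.items()}
        self.seen = set()

    def resolve(self, key):
        entry = self.entries[key]
        if key not in self.seen:
            # Assumption: the first reference counts as the introductory context.
            self.seen.add(key)
            return entry.findtext("glossBody/glossSurfaceForm")
        return entry.findtext("glossBody/glossAlt/glossAcronym")

resolver = TermResolver(GLOSSARY)
first = f"The {resolver.resolve('abs')} will prevent the car from skidding."
later = f"The {resolver.resolve('abs')} will prevent the car from skidding."
print(first)
print(later)
```

A process that renders tooltips or legal text would need a different notion of "introductory", so the tracking rule is deliberately isolated in one method.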

Translation Issues for Abbreviated Forms

The following cases for abbreviated forms must be contemplated when working with documents that require translation:

  • The source and target languages may have different forms for a term. One language may lack an abbreviation or acronym that's recognized in the other, or the preferred term may be an abbreviation or acronym in one language but the expanded form in another.

    Translation workbenches don't allow the translator to change markup during translation. That's necessary for the translation workbench to apply to any markup language without building in an awareness of specific markup vocabularies. For that reason, the text of an acronym and surface form may be provided in the source language but omitted or translated to the same text in a target language while preserving the markup structure.

    The following example illustrates this approach for the English source topic:

    <glossentry id="wmd" xml:lang="en">
      <glossterm>Weapons of Mass Destruction</glossterm>
      <glossBody>
        <glossSurfaceForm>Weapons of Mass Destruction (WMD)</glossSurfaceForm>
        <glossAlt>
          <glossAcronym>WMD</glossAcronym>
        </glossAlt>
      </glossBody>
    </glossentry>

    Term resolution processing uses the supplied text from the <glossAcronym> and <glossSurfaceForm> elements in the same way as the source English text.

    Term resolution processing should always ignore empty elements. If the <glossAcronym> and <glossSurfaceForm> elements are empty, an <abbreviated-form> reference should resolve to the <glossterm> text. Thus, if allowed by the translation workbench, the translator could take advantage of standard processing by omitting the text translation for both the <glossAcronym> and <glossSurfaceForm> elements. The result of processing an empty element should be the same as if the translator had copied the <glossterm> text into the empty element.

    <glossentry id="wmd" xml:lang="es">
      <glossterm>armas de destrucción masiva</glossterm>
      <glossBody>
        <glossSurfaceForm></glossSurfaceForm>
        <glossAlt>
          <glossAcronym></glossAcronym>
        </glossAlt>
      </glossBody>
    </glossentry>

    However, translation processing systems may not permit the translator to leave an element empty and will generate an error message that the translation is incomplete. In that case, the translator must duplicate the <glossterm> in the <glossAcronym> and <glossSurfaceForm> elements.

    <glossentry id="wmd" xml:lang="es">
      <glossterm>armas de destrucción masiva</glossterm>
      <glossBody>
        <glossSurfaceForm>armas de destrucción masiva</glossSurfaceForm>
        <glossAlt>
          <glossAcronym>armas de destrucción masiva</glossAcronym>
        </glossAlt>
      </glossBody>
    </glossentry>
  • In some languages, such as Spanish, the expansion of an abbreviated form should be written in lower case. This can lead to a grammatical error if the first appearance of an abbreviated form occurs at the beginning of a sentence. A similar problem arises in English with the indefinite article, where the choice between 'a' and 'an' depends on whether the inserted text begins with a vowel. It is up to the composition/display software to handle this. For example, the acronym for AIDS should be translated as:

    <glossentry id="aids" xml:lang="es">
      <glossterm>síndrome de inmuno-deficiencia adquirida</glossterm>
      <glossBody>
        <glossSurfaceForm>síndrome de inmuno-deficiencia adquirida (SIDA)</glossSurfaceForm>
        <glossAlt>
          <glossAcronym>SIDA</glossAcronym>
        </glossAlt>
      </glossBody>
    </glossentry>

    Normally the <glossSurfaceForm> text from the above example could not be used at the beginning of a sentence, because it begins with a lower case letter. It is up to the composition software for the given language to cope with this input.

  • Abbreviated forms can cause problems for inflected languages because abbreviated form expansion needs to be presented in the nominative case, without any inflection. This can be achieved with a surface form that provides the full form in parentheses immediately following the acronym. For example, the Polish acronym for the European Union is:

    <glossentry id="eu" xml:lang="pl">
      <glossterm>Unia Europejska</glossterm>
      <glossBody>
        <glossSurfaceForm>UE (Unia Europejska)</glossSurfaceForm>
        <glossAlt>
          <glossAcronym>UE</glossAcronym>
        </glossAlt>
      </glossBody>
    </glossentry>

    Using the above construct enables automated handling of the abbreviated form in Polish without causing any problems with grammatical inflection. For example, when stating that something occurred within the EU, the inflected form in Polish caused by the use of the locative case would have to be used. For the actual abbreviated form itself this is not a problem, since abbreviated forms are not inflected. Consider, for example, the phrase "In the European Union (EU) there are many institutions…":

    "W Unii Europejskiej (UE) jest wiele instytucji…"

    However, because the translator controls how the text is displayed via the <glossSurfaceForm> element, the first occurrence of the abbreviated form can instead use the following acceptable construct:

    "W UE (Unia Europejska) jest wiele instytucji…"
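
Two of the points above, the fallback for empty elements and the handling of a lower-case surface form at the start of a sentence, can be sketched in Python as follows. The resolve function and its sentence_start flag are hypothetical, the capitalization rule is a naive stand-in for what composition software might do, and the sketch falls back directly to the <glossterm> text without modelling any intermediate checks.

```python
import xml.etree.ElementTree as ET

WMD_ES = ET.fromstring("""
<glossentry id="wmd" xml:lang="es">
  <glossterm>armas de destrucción masiva</glossterm>
  <glossBody>
    <glossSurfaceForm></glossSurfaceForm>
    <glossAlt>
      <glossAcronym></glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>
""")

AIDS_ES = ET.fromstring("""
<glossentry id="aids" xml:lang="es">
  <glossterm>síndrome de inmuno-deficiencia adquirida</glossterm>
  <glossBody>
    <glossSurfaceForm>síndrome de inmuno-deficiencia adquirida (SIDA)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>SIDA</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>
""")

def _text(entry, path):
    """Return whitespace-normalized text at path, or '' when missing or empty."""
    return " ".join((entry.findtext(path) or "").split())

def resolve(entry, introductory, sentence_start=False):
    """Resolve a reference, treating empty elements as if they held the glossterm."""
    path = "glossBody/glossSurfaceForm" if introductory else "glossBody/glossAlt/glossAcronym"
    text = _text(entry, path) or _text(entry, "glossterm")
    if sentence_start and text:
        # Assumption: upper-casing the first character is acceptable for the language.
        text = text[0].upper() + text[1:]
    return text

print(resolve(WMD_ES, introductory=True))
print(resolve(WMD_ES, introductory=False))
print(resolve(AIDS_ES, introductory=True, sentence_start=True))
```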

All terms

This section discusses the full glossentry markup available for any terminology application.

Longer description

DITA 1.1 introduced a simple glossary specialization to meet basic needs for publication as part of a bookmap.

The DITA 1.1 glossary specialization, however, is too simple to support many common glossary applications. For instance, many content publishers need to distinguish an abbreviation from the full term. In addition, a more complete representation of terminology can support processing such as the following:

Translation
The glossary identifies key terminology for human translators as well as the meaning that the term must retain in translation. In addition, the identification of special terms and the terminology data for those terms provides a dictionary that helps to enable automated translation of mentions of terms and content in the vicinity of such mentions.

Key terminology standards include TBX.

Semantic search
The glossary identifies the subjects associated with specific terms to enable indexing content based on the meaning of terms rather than the surface forms in the text.

Key semantic standards include TopicMaps and SKOS.

Handling of abbreviated forms

Abbreviated forms, such as acronyms, are ubiquitous in technical documentation. Abbreviated forms are a special case of glossary term because they need to be expanded to the full form under some conditions (such as the first encounter within a printed document). In electronic published documents, abbreviated form expansions can also be made available in the form of a hyperlink or 'tool tip' mechanism. In addition, the expanded text of abbreviated forms should be available for automatic inclusion in glossary entries for the publication. This proposal relates to all types of abbreviations, such as acronyms, initialisms, apocope, clipping, elision, syncope, syllabic abbreviation, and portmanteau.

To enable these applications, DITA 1.2 allows additional detail about the term and additional methods for referring to terms that can deliver either abbreviated or surface forms of the term.

Statement of Requirement

The following requirements apply to glossary terms generally:

  • Supply the preferred term and the verbal definition of the subject of the term including an explanation of the cases that are in or out of scope for the term.
  • List synonyms, abbreviations, acronyms, and other alternative terms with the same meaning. (Called a synonym set by WordNet; please see http://en.wikipedia.org/wiki/Synsets.)
  • Provide directions for the correct use of the preferred or alternate terms.
  • Allow extensibility for more detailed terminology data.

In addition, abbreviated forms and their translations require special handling:

  • Some abbreviated forms are never translated, especially those that are intended for a knowledgeable, technical audience, as well as those that refer to standardized international concepts, such as XML.
  • Some abbreviated forms represent a brand name for which the original expanded form is no longer used or is secondary to the abbreviated forms.
  • Abbreviated forms such as xml, jpg, and html are typically used in their original form, that is, they may be quoted in lower case, and they are not translated.
  • Abbreviated forms that have equivalent expressions in other languages are typically translated. United Nations (UN) and Weapons of Mass Destruction (WMD) have equivalents in other languages besides English. For instance, the French translation of “UN” is “ONU”.
  • Some abbreviated forms are translated for clarity and also referred to in their original untranslated form. For instance, OASIS may be translated so that readers understand its significance in their native language but the original acronym would be retained in the translation to facilitate electronic search.
  • The expanded form of an abbreviated form in the target language may require a different formulation than the expanded form of the abbreviated form in the source language, depending on the target audience and the grammatical features of the target language.

For example, the surface form for an abbreviated form in English might consist of the abbreviated form followed by its expanded form in parentheses. By contrast, the translated version might consist of the expanded form followed by the abbreviated form in parentheses. The translated version might also include the English and the translation.

For example, in a Polish book on Java Web programming, the first reference to JSP may appear as follows:

"JSP (ang. Java Server Pages)"

Another example from a publication concerning OASIS:

"OASIS (ang. Organization for the Advancement of Structured Information Standards – organizacja dla propagowania strukturalnych standardów informacyjnych)"

In the first example, the translator assumes the reader will not require a translation of the English abbreviated form. In the second example, the translator assumes the reader may not understand the English expanded form and therefore adds the translation.

Use Cases

  • Publishing an explanatory glossary for users.
  • Providing referenceable terms that resolve to expanded forms when appropriate and that resolve in different ways in translation.
  • Guiding writers on the correct use of terms.
  • Informing translators on the correct interpretation of terms.
  • Keeping a terminology database up to date by importing maintained terms from content sources.
  • Supplying automated translation, text analysis, or search indexing tools with linguistic data.
  • Defining subjects for classification.

Scope

Moderate: adding elements to one specialized topic, providing a map domain for defining keys, and providing an element domain for referring to keys.

Technical Requirements

The full set of elements provided by the expanded glossentry topic includes the following elements:

Base element: <concept>
Specialized element: <glossentry>
Content:
  1. one <glossterm>
  2. zero or one <glossdef>
  3. zero or one <prolog>
  4. zero or one <glossBody>
  5. zero or one <related-links>
Purpose: Specifies the preferred and alternate forms of a term and the subject designated by those terms within a glossary or other kind of terminology set.
Note: Some terminology discussions use "concept" to denote the meaning of a term. For the DITA community, however, "concept" has a strong association with the core DITA concept topic type. To avoid confusion, this proposal denotes the meaning of a term with "subject" (which has appropriate connotations by way of "subject classification" and the Dublin Core subject property).

Base element: <title>
Specialized elements: <glossterm>, <glossAbbreviation>, <glossAcronym>, <glossShortForm>, <glossSynonym>
Content: title content for <glossterm> for consistency with DITA 1.1; text, <term>, <keyword>, or <tm> content for the other <title> specializations
Purpose: Identifies the role of one term with respect to other variant terms. The <glossterm> element appears within the <glossentry> element to indicate the preferred term. The other <title> specializations can appear within the <glossAlt> element to indicate alternative forms with the same meaning. In particular, <glossShortForm> indicates a shorter alternative to the preferred term.

Base element: <abstract>
Specialized element: <glossdef>
Content: section content or <shortdesc>
Purpose: Provides a verbal definition of the subject of a term for writers and users.

Base element: <conbody>
Specialized element: <glossBody>
Content:
  1. zero or one <glossPartOfSpeech>
  2. zero or one <glossStatus>
  3. zero or more <glossProperty>
  4. zero or one <glossSurfaceForm>
  5. zero or one <glossUsage>
  6. zero or one <glossScopeNote>
  7. zero or more <glossSymbol>
  8. zero or more <note>
  9. zero or more <glossAlt>
Purpose: Represents terminology detail. The part of speech applies to all term forms to encourage consistency of the alternate forms with the preferred term. The surface form presents the term in an unambiguous way. The status indicates the overall status of the subject of the term. The <glossProperty> and <note> elements are extension points for more detail about the preferred term or its subject (such as the linguistic properties from basic or full TBX).

Base element: <data>
Specialized element: <glossPartOfSpeech>
Content: value attribute enumerated as noun, properNoun, verb, adjective, or adverb; empty content
Purpose: Identifies the part of speech for the preferred and alternate terms. Alternate terms must have the same part of speech as the preferred term because all terms in the glossentry topic designate the same subject. If the part of speech isn't specified, the default is a noun for the standard enumeration.
Note: The standard enumeration is extensible or replaceable. The enumeration is validated by means of the proposed controlled values mechanism or through processing rather than validated as an XML enumeration.

Base element: <data>
Specialized element: <glossStatus>
Content: value attribute enumerated as preferred, restricted, prohibited, or obsolete; empty content
Purpose: Identifies the usage status of a preferred or alternate term. If the status isn't specified, the <glossterm> provides a preferred term and an alternate term provides an allowed term.
Note: This enumeration must be extensible or replaceable. The enumeration is validated by means of the proposed controlled values mechanism or through processing rather than validated as an XML enumeration.

Base element: <data>
Specialized element: <glossProperty>
Content: data content
Purpose: An extension point for linguistic or semantic properties such as the gender of the term.

Base element: <p>
Specialized element: <glossSurfaceForm>
Content: text, <term>, <keyword>, or <tm> content
Purpose: Specifies an unambiguous presentation of the term that may combine multiple forms. For instance, for an acronym, the <glossSurfaceForm> might provide the full form as well as the acronym in parentheses. The surface form is suitable to introduce the term in new contexts.

Base element: <note>
Specialized element: <glossUsage>
Content: note content
Purpose: Any information about the correct usage of the term.

Base element: <note>
Specialized element: <glossScopeNote>
Content: note content
Purpose: A clarification of the subject designated by the terms such as examples of included or excluded companies or products. For instance, a scope note for "Linux" might explain that the term doesn't apply to UNIX products and give some examples of Linux products that are included as well as UNIX products that are excluded.

Base element: <image>
Specialized element: <glossSymbol>
Content: image content
Purpose: Identifies a standard icon associated with the subject of the term.

Base element: <section>
Specialized element: <glossAlt>
Content:
  1. one <glossAbbreviation>, <glossAcronym>, <glossShortForm>, or <glossSynonym>
  2. zero or one <glossStatus>
  3. zero or more <glossProperty>
  4. zero or one <glossUsage>
  5. zero or more <note>
  6. zero or more <glossAlternateFor>
Purpose: Identifies a variant term for the preferred term. Any list of alternative terms is, of course, specific to the language, so translation may result in empty elements.

Base element: <xref>
Specialized element: <glossAlternateFor>
Content: Empty content
Purpose: Indicates when a variant term has a relationship to another variant term as well as to the preferred term.
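
Because the part-of-speech and status enumerations are checked through processing rather than as XML enumerations, a processor has to apply the defaults and validate the values itself. The following Python sketch shows one way this could look. The effective_properties function is hypothetical, the allowed sets are passed as parameters to reflect that the enumerations are extensible or replaceable, and the sample entry is a trimmed copy of the USB flash drive entry that appears later in this proposal.

```python
import xml.etree.ElementTree as ET

PARTS_OF_SPEECH = frozenset({"noun", "properNoun", "verb", "adjective", "adverb"})
STATUSES = frozenset({"preferred", "restricted", "prohibited", "obsolete"})
ALT_TERM_TAGS = ("glossAbbreviation", "glossAcronym", "glossShortForm", "glossSynonym")

ENTRY = ET.fromstring("""
<glossentry id="usbfd">
  <glossterm>USB flash drive</glossterm>
  <glossBody>
    <glossAlt>
      <glossAcronym>UFD</glossAcronym>
    </glossAlt>
    <glossAlt>
      <glossAbbreviation>flash</glossAbbreviation>
      <glossStatus value="prohibited"/>
    </glossAlt>
  </glossBody>
</glossentry>
""")

def _checked_value(parent, tag, allowed, default):
    """Return the child's value attribute, the default if absent, or raise if unknown."""
    element = parent.find(tag) if parent is not None else None
    if element is None or element.get("value") is None:
        return default
    value = element.get("value")
    if value not in allowed:
        raise ValueError(f"unknown {tag} value: {value!r}")
    return value

def effective_properties(entry, parts_of_speech=PARTS_OF_SPEECH, statuses=STATUSES):
    """Apply the stated defaults: noun, preferred for glossterm, allowed for alternates."""
    body = entry.find("glossBody")
    pos = _checked_value(body, "glossPartOfSpeech", parts_of_speech, "noun")
    terms = [(entry.findtext("glossterm"),
              _checked_value(body, "glossStatus", statuses, "preferred"))]
    for alt in (body.findall("glossAlt") if body is not None else []):
        term = next((child.text for child in alt if child.tag in ALT_TERM_TAGS), None)
        terms.append((term, _checked_value(alt, "glossStatus", statuses, "allowed")))
    return {"partOfSpeech": pos, "terms": terms}

PROPS = effective_properties(ENTRY)
print(PROPS)
```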

The following example shows the minimum declaration of a term:

<glossentry id="highavail">
    <glossterm>High Availability</glossterm>
</glossentry>

The following example shows a detailed glossary entry specifying the usage for the preferred and alternate terms:

<glossentry id="usbfd">
  <glossterm>USB flash drive</glossterm>
  <glossdef>A small portable drive.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossUsage>Do not provide in upper case (as in "USB Flash Drive") because that suggests a trademark.</glossUsage>
    <glossAlt>
      <glossAcronym>UFD</glossAcronym>
      <glossUsage>Explain the acronym on first occurrence.</glossUsage>
    </glossAlt>
    <glossAlt id="memoryStick">
      <glossSynonym>memory stick</glossSynonym>
      <glossUsage>This is a colloquial term.</glossUsage>
    </glossAlt>
    <glossAlt>
      <glossAbbreviation>stick</glossAbbreviation>
      <glossStatus value="prohibited"/>
      <glossUsage>This is too colloquial.</glossUsage>
      <glossAlternateFor href="#usbfd/memoryStick"/>
    </glossAlt>
    <glossAlt>
      <glossAbbreviation>flash</glossAbbreviation>
      <glossStatus value="prohibited"/>
      <glossUsage>This short form is ambiguous.</glossUsage>
    </glossAlt>
  </glossBody>
</glossentry>

Using the standard keyref mechanism, the writer can assign a key to the declaration topic and refer to the key to insert the preferred term. The benefit in using a reference is that the preferred term can be maintained in one place:

<map>
  ...
  <topicref keys="reliability" href="highavail.dita"
        linking="none" toc="no" print="no" search="no"/>
  ...
  <topicref href="configdb.dita"/>
  ...
</map>

<task id="configdb">
  <title>Configuring the database.</title>
  ...
    <context>To enable <term keyref="reliability"/>,
        you configure the database</context>
  ...
</task>

Two new domains support easy definition and use of keys for glossary entry topics:

  • A map domain specializes <topicref> to provide a <glossref> element. The <glossref> element requires the keys and href attributes and defaults the print, search, and toc attributes to the "no" value and the linking attribute to the "none" value. The <glossref> element is always empty.
  • A topic domain specializes <term> to provide an <abbreviated-form> element. The <abbreviated-form> element requires the keyref attribute and is always empty.

Writers can set the linking attribute to the "target" value on the <glossref> element to enable linking from the use to the glossary term. The <glossref> element is only a convenience. Writers can always use the standard capabilities of the keyref mechanism. For instance, writers can use the <topicref> element with a keys attribute to pull a glossary topic into a TOC context while defining a key.
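
As a sketch of how the two domain elements might be used together, the earlier map and task context could be written as follows. The second key (ufd), the usbfd.dita file name, and the sentence inside <context> are assumptions made for illustration; they do not appear in the proposal.

```xml
<map>
  <!-- Same effect as the <topicref keys="reliability" .../> shown earlier:
       print, search, and toc default to "no" and linking to "none". -->
  <glossref keys="reliability" href="highavail.dita"/>
  <!-- Hypothetical key and file name; linking is overridden to allow
       links from each use to the glossary entry. -->
  <glossref keys="ufd" href="usbfd.dita" linking="target"/>
  <topicref href="configdb.dita"/>
</map>

<context>Before you enable <term keyref="reliability"/>, copy the
    configuration to a <abbreviated-form keyref="ufd"/>.</context>
```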

When the writer uses the <abbreviated-form> element to refer to a glossentry topic, processing performs the following checks, in order, to find an abbreviated or surface form with text for the reference, skipping all subsequent checks once the text has been found:

  1. If the context for the term reference requires the surface form (as described in Rendition of Abbreviated Forms) and the glossentry topic provides a <glossSurfaceForm> element that contains text, use the surface form.
  2. If the context doesn't require the surface form and the glossentry topic has at least one <glossAcronym> or <glossAbbreviation> alternate form that contains text and doesn't have a <glossStatus> of prohibited or obsolete, use the abbreviated form. (If the glossentry topic has multiple abbreviated forms that qualify and one has a <glossStatus> of preferred, use that abbreviated form; otherwise, use the first qualified abbreviated form.)
  3. Otherwise, use the <glossterm> text for the term.
Note: As with any keyref situation, it is possible to use more than one keyref value to refer to the same target. So, when determining the first occurrence, processors should consider all references to the same target topic, not just references that use the same keyref value.
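
To illustrate the checks above, consider a trimmed version of the earlier USB flash drive entry. The <glossSurfaceForm> text and its position inside <glossBody> are assumptions for this sketch; the remaining content is taken from the detailed example.

```xml
<glossentry id="usbfd">
  <glossterm>USB flash drive</glossterm>
  <glossBody>
    <!-- Assumed surface form text, not part of the original example -->
    <glossSurfaceForm>USB flash drive (UFD)</glossSurfaceForm>
    <glossAlt>
      <!-- Qualifies in check 2: has text, no prohibited or obsolete status -->
      <glossAcronym>UFD</glossAcronym>
    </glossAlt>
    <glossAlt>
      <!-- Skipped in check 2 because of its prohibited status -->
      <glossAbbreviation>stick</glossAbbreviation>
      <glossStatus value="prohibited"/>
    </glossAlt>
  </glossBody>
</glossentry>
```

Under these assumptions, a reference in a context that requires the surface form would render as "USB flash drive (UFD)" (check 1), and any other reference would render as "UFD" (check 2). If both the <glossSurfaceForm> element and the UFD alternate were removed, the reference would fall back to "USB flash drive" (check 3).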

Writers can also use the <term> element with a keyref attribute to refer to a glossentry. Processing inserts text from the glossentry topic only when the referencing <term> element doesn't contain text. As a result, writers can use the <term> element to delimit terms within content while identifying the corresponding glossary entry. That is, the <term> element can provide a context-specific surface form as its content where appropriate.
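
For instance, the two uses below refer to the same glossary entry through the reliability key defined in the earlier map. The wording of the second sentence is a hypothetical illustration.

```xml
<context>
  <!-- Empty <term>: processing inserts the text from the glossentry topic -->
  To enable <term keyref="reliability"/>, you configure the database.
  <!-- <term> with content: the writer's own surface form is kept as is -->
  A <term keyref="reliability">highly available</term> database requires
  this configuration.
</context>
```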

For authoring convenience, a <glossgroup> topic can contain multiple <glossentry> topics:

Base Element Content Purpose
<concept> <glossgroup>
  1. one <title>
  2. zero or one <prolog>
  3. zero or more <glossgroup> or <glossentry> topics
Groups a set of glossary entries for some purpose, for instance, for convenient maintenance based on the alphabetic collation of the preferred terms or on the subject matter covered by the terms.
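
A minimal sketch of such a group, reusing the two entries from the earlier examples. The group id and title are hypothetical, and the second entry is shortened.

```xml
<glossgroup id="sampleGlossary">
  <title>Sample glossary</title>
  <glossentry id="highavail">
    <glossterm>High Availability</glossterm>
  </glossentry>
  <glossentry id="usbfd">
    <glossterm>USB flash drive</glossterm>
    <glossdef>A small portable drive.</glossdef>
  </glossentry>
</glossgroup>
```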

Relationships between the subjects of terms (such as the hypernym or kind-of relationship and the holonym or part-of relationship specified by WordNet) can be specified for glossary topics by a subject scheme map. (See Proposal 12031 for Controlled Values.)

New or Changed Specification Language

The Language Reference for the glossentry topic should be revised to reflect the contents of this proposal, including translation considerations and their impact on the use of abbreviations.

Costs

  • Implementation of the DTD and Schema changes for the glossentry topic, of the map domain for the <glossref> element, of the topic domain for the <abbreviated-form> element, and of the glossgroup topic.

  • Implementation of special processing to emit the surface form when appropriate.

Benefits

  • Glossary data can serve publishing, text analysis, search, translation processing, and other applications.

    In particular, abbreviated forms can be handled uniformly and consistently by putting resolution of the abbreviated form under the control of the composition software, so that glossary, tooltip, and first-occurrence forms can be provided as end users require.


