dita-translation message



Subject: Fw: [dita] Issue for unifying acronyms and glossary



I know that Erik is not able to send notes to the dita-translation list, so
I'm forwarding this on his behalf.

Thanks,

Robert D Anderson
IBM Authoring Tools Development
Chief Architect, DITA Open Toolkit
(507) 253-8787, T/L 553-8787 (Good Monday & Thursday)
----- Forwarded by Robert D Anderson/Rochester/IBM on 01/08/2008 07:22 AM
-----
                                                                           
From: Erik Hennum/Oakland/IBM
Date: 01/08/2008 12:15 AM
To: "Andrzej Zydron" <azydron@xml-intl.com>, Don Day/Austin/IBM, "Gershon L Joseph" <gershon@tech-tav.com>, "JoAnn Hackos" <joann.hackos@comtech-serv.com>, "Kara Warburton" <KARA@CA.IBM.COM>, Robert D Anderson/Rochester/IBM, "Rodolfo M. Raya" <rodolfo@heartsome.net>
cc:
Subject: RE: [dita] Issue for unifying acronyms and glossary (Document link: Robert D Anderson)



Hi, Esteemed Translation and Terminology Experts:

Thanks for considering the proposal to unify terminology markup in DITA.
I've appended a revision based on the feedback you gave today to Robert:

(See attached file: IssueGlossary12026.html)

The main changes:

   Added a usage section at the front to clarify single sourcing of
   acronym resolution, glossary publication, and termbase population.
   Assembled the subset of the full vocabulary used for acronyms as a
   separate section (labelled "Acronym Terms") so it can be considered
   easily without confusion.
   Revised the section on "Translation Issues for Abbreviated Forms."

I understand from Robert that there's some concern that the attempt to
integrate the proposals is hostile to the acronym proposal and the
Translation Subcommittee.

That is not my intent in the slightest. The clear value of what the
Translation Subcommittee has done is driving the attempt to unify the
proposals and to capture the feedback from the Translation Subcommittee.
If this revision falls short, please let me know where and how so we can
see if we can fix those shortcomings.

The concern motivating this effort is improving the environment for DITA
adopters who have single sourcing requirements for terminology.

Here's a real example.  A DITA adopter has content that includes the SMTP
acronym.  To minimize typos and enable translation, the adopter wants
writers to refer to the acronym instead of just embedding the acronym in the
text.  Many end users don't know that the SMTP acronym means Simple Mail
Transfer Protocol, so the adopter both wants to expand SMTP on first
appearance and wants to include an explanation of the SMTP acronym in the
glossary.

Should the adopter have to maintain Simple Mail Transfer Protocol as a
surface form in two different places -- once for acronym resolution and
separately for glossary publication?  Wouldn't it be better if the adopter
could maintain the SMTP acronym, the Simple Mail Transfer Protocol surface
form, and the explanation in one place and use that declaration for all
processing (just like other XML single sourcing)?

Finally, I wanted to address concerns that the glossary proposal was a
recent addition to DITA 1.2:

   Acronyms were part of the original glossary proposal in 2005:
   http://www.oasis-open.org/apps/org/workgroup/dita/email/archives/200511/msg00002.html


   When the original glossary proposal was pared back for DITA 1.1, the
   explicit intent was to reintroduce abbreviations in a future version.
   ("For an example of the limitations to be addressed in DITA 1.2, many
   content publishers will need to distinguish a labelling abbreviation
   from the full term.")
   http://www.oasis-open.org/apps/org/workgroup/dita/email/archives/200603/msg00058.html


   The approved DITA 1.1 proposal also noted that a future version would
   restore the deferred parts of the original proposal
   http://www.oasis-open.org/apps/org/workgroup/dita/download.php/1751/Issue14a.html


   Glossary was on the top 5 list that IBM submitted for DITA 1.2 and was
   accepted for the official list from the start of work on DITA 1.2:
   http://www.oasis-open.org/apps/org/workgroup/dita/email/archives/200703/msg00071.html


   The glossary proposal came up several times during the summer -- for
   instance:
   http://www.oasis-open.org/apps/org/workgroup/dita/email/archives/200705/msg00032.html


   The proposal was brought forward before the deadline for DITA 1.2:
   http://www.oasis-open.org/apps/org/workgroup/dita/download.php/26130/IssueGlossary12026.html

   The proposal was scheduled for a vote and approved:
   http://www.oasis-open.org/apps/org/workgroup/dita/download.php/26192/DitaTCMeetingMinutes071120.txt

   http://www.oasis-open.org/apps/org/workgroup/dita/download.php/26263/DitaTCMeetingMinutes071127.txt


I know that the acronym proposal has its own history.  From my perspective,
both proposals have strengths, and the union of those strengths would be a
better solution for DITA adopters than separate, independent vocabularies.


Thanks again for keeping an open mind about this issue,


Erik Hennum
ehennum@us.ibm.com
DITA Proposed Feature #12026 and #12038

Build on the DITA 1.1 glossary specialization for more complete support of glossary, linguistic, and semantic applications and also to assist in the resolution and handling of abbreviated-form text such as acronyms, general abbreviations, and short forms in source and target text within DITA documents.

Terminology applications

This section gives examples of how subsets of the glossentry markup meet requirements for different applications.

Usage for acronym resolution

An adopter interested only in term resolution for acronyms can declare an acronym with a glossentry topic similar to the following example:

<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
  </glossBody>
</glossentry>

The adopter can declare a key for the acronym using the standard DITA 1.2 keyref mechanism:

<map>
  ...
  <topicref href="maintcar.dita"/>
  ...
  <glossref keys="abs" href="antiLockBrake.dita"/>
  ... key declarations for other referenced acronyms ...
</map>

The adopter can then refer to the acronym using the standard DITA 1.2 keyref mechanism:

<task id="maintcar">
  ...
    <info>The <abbreviation keyref="abs"/> will prevent the car from skidding ...</info>
  ...
</task>

Processes can resolve the "abs" reference to the <glossSurfaceForm> text in introductory contexts and to the <glossAcronym> text in other contexts.
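
As a rough illustration of that resolution step, the following Python sketch reads the glossentry shown earlier and returns either form. The function name resolve_acronym and the introductory flag are assumptions made for this example; the proposal does not prescribe how a process decides that a context is introductory.

```python
import xml.etree.ElementTree as ET

# The acronym declaration from the example above.
GLOSSENTRY = """\
<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
  </glossBody>
</glossentry>
"""


def resolve_acronym(entry_xml: str, introductory: bool) -> str:
    """Return the surface form in introductory contexts, else the acronym.

    Hypothetical helper: how a context is judged introductory is left
    to the caller.
    """
    entry = ET.fromstring(entry_xml)
    acronym = entry.findtext("glossAcronym", default="").strip()
    surface = entry.findtext("glossBody/glossSurfaceForm", default="").strip()
    # Fall back to the acronym when no surface form is declared.
    return surface if introductory and surface else acronym


print(resolve_acronym(GLOSSENTRY, introductory=True))
print(resolve_acronym(GLOSSENTRY, introductory=False))
```

The fallback to the acronym when no surface form exists is a design choice of this sketch, not something the proposal states.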

Usage for glossary publishing

An adopter interested only in traditional glossary publishing can explain one sense of a term with a glossentry topic similar to the following example:

<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossdef>A brake technology that minimizes skids.</glossdef>
</glossentry>

The adopter can then pull together a subset of the defined terms for a deliverable as in the following example:

<map>
  ...
  <topichead navtitle="glossary">
    <topicref href="antiLockBrake.dita"/>
    ... other terms in the glossary for this deliverable ...
  </topichead>
</map>

To produce a traditional glossary, a process can sort the terms included in a deliverable and list the explained senses under each term.
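
A minimal sketch of such a sorting process follows, assuming the glossentry topics of a deliverable have already been collected as XML strings. The helper names and the case-insensitive sort order are assumptions of this example; a real process would use locale-aware collation.

```python
import xml.etree.ElementTree as ET

# Two entries taken from the examples in this proposal.
ENTRIES = [
    '<glossentry id="usbfd"><glossterm>USB flash drive</glossterm>'
    '<glossdef>A small portable drive.</glossdef></glossentry>',
    '<glossentry id="abs"><glossAcronym>ABS</glossAcronym>'
    '<glossdef>A brake technology that minimizes skids.</glossdef></glossentry>',
]

# Elements that can carry the preferred term of a glossentry.
PREFERRED_TAGS = ("glossterm", "glossAbbreviation", "glossAcronym")


def preferred_term(entry):
    """Return the text of the first preferred-term child, or None."""
    for child in entry:
        if child.tag in PREFERRED_TAGS:
            return (child.text or "").strip()
    return None


def build_glossary(entries_xml):
    """Group definitions (senses) under each term and sort the terms."""
    senses = {}
    for xml_text in entries_xml:
        entry = ET.fromstring(xml_text)
        term = preferred_term(entry)
        definition = entry.findtext("glossdef")
        if term is None or definition is None:
            continue  # nothing to publish for this entry
        senses.setdefault(term, []).append(definition.strip())
    # Case-insensitive ordering is a simplification of real collation.
    return sorted(senses.items(), key=lambda item: item[0].casefold())


for term, definitions in build_glossary(ENTRIES):
    print(term)
    for definition in definitions:
        print("  " + definition)
```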

Usage for single sourcing term resolution and glossary publishing

Adopters don't have to declare the same acronym in different ways for different purposes but instead can single source a declaration of acronym terms for multiple purposes. An adopter who needs both to refer to an acronym and list the acronym in a published glossary would provide an explanation of the acronym as in the following example:

<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
  </glossBody>
</glossentry>

The glossary can include the explained acronym (as shown in the following example) as well as glossary terms that aren't acronyms. The map can also declare keys for acronyms that are referenced but not included in the glossary:

<map>
  ...
  <topicref href="maintcar.dita"/>
  ...
  <topichead navtitle="glossary">
    <topicref keys="abs" href="antiLockBrake.dita"/>
    ... other referenced terms in the glossary ...
  </topichead>
  ... key declarations for other referenced acronyms that aren't in the glossary ...
</map>

The adopter can still refer to the acronym with the <abbreviation> element as in the following example:

<task id="maintcar">
  ...
    <info>The <abbreviation keyref="abs"/> will prevent the car from skidding ...</info>
  ...
</task>

Processing for term resolution to either the <glossSurfaceForm> or <glossAcronym> text and processing for glossary publishing work as before.

Usage for populating a terminology database

While a number of text analysis tools exist, the challenge for adopters is populating the terminology database that enables use of such tools. Published glossaries provide a practical source for terminology to populate such terminology databases.

An adopter whose requirements include not only acronym resolution and glossary publishing but also populating a terminology database can create glossentry topics similar to the following:

<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossSynonym>Anti-skid Brakes</glossSynonym>
      <glossUsage>Allowed in legacy content but not preferred.</glossUsage>
      <glossStatus value="restricted"/>
    </glossAlt>
  </glossBody>
</glossentry>

As illustrated by these examples, adopters can scale up for more sophisticated applications as their requirements change by taking advantage of optional elements to provide additional detail about the term.
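
To make the termbase use case concrete, the sketch below flattens the glossentry above into a plain record that an import job could load. The record layout (the dictionary keys) is purely hypothetical and is not TBX or any other standard format.

```python
import xml.etree.ElementTree as ET

# The terminology-database example from this section.
GLOSSENTRY = """\
<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossSynonym>Anti-skid Brakes</glossSynonym>
      <glossUsage>Allowed in legacy content but not preferred.</glossUsage>
      <glossStatus value="restricted"/>
    </glossAlt>
  </glossBody>
</glossentry>
"""


def to_termbase_record(entry_xml):
    """Flatten one glossentry into a dictionary (hypothetical layout)."""
    entry = ET.fromstring(entry_xml)
    body = entry.find("glossBody")
    pos = body.find("glossPartOfSpeech") if body is not None else None
    record = {
        "id": entry.get("id"),
        "preferred": entry.findtext("glossAcronym"),
        "definition": entry.findtext("glossdef"),
        "part_of_speech": pos.get("value") if pos is not None else None,
        "alternates": [],
    }
    if body is not None:
        for alt in body.findall("glossAlt"):
            status = alt.find("glossStatus")
            record["alternates"].append({
                # This sketch only handles synonym alternates.
                "term": alt.findtext("glossSynonym"),
                "status": status.get("value") if status is not None else None,
                "usage": alt.findtext("glossUsage"),
            })
    return record


print(to_termbase_record(GLOSSENTRY))
```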

Acronym terms

This section discusses the subset of the glossentry vocabulary specific to acronyms.

Reference for acronym terms

To use the glossentry topic for acronym resolution, the writer takes advantage of the following four elements:

Specialized element: <glossentry>
Base element: <concept>
Content:
  1. one <glossAcronym>
  2. one <glossBody>
Purpose: Declares an acronym and its surface form.

Specialized element: <glossAcronym>
Base element: <title>
Content: text, <term>, <keyword>, or <tm> content
Purpose: Specifies a preferred term with an <glossAcronym> form.

Specialized element: <glossBody>
Base element: <conbody>
Content:
  1. one <glossSurfaceForm>
Purpose: Contains detail about the acronym.

Specialized element: <glossSurfaceForm>
Base element: <p>
Content: text, <term>, <keyword>, or <tm> content
Purpose: Specifies an unambiguous presentation of an acronym such as providing the full form with the acronym in parentheses. The surface form is suitable to introduce the term in new contexts.

The <glossentry> topic provides additional subelements that are optional but available to scale up for single sourcing for additional purposes such as glossary publishing of the acronym (see Technical Requirements below).

Two new domains complement the glossary entry topic to make it easy to refer to acronyms (as shown in the example of acronym resolution):

  • A map domain specializes <topicref> to provide a <glossref> element to define a key for an acronym.
  • A topic domain specializes <term> to provide an <abbreviation> element to insert the acronym into the content of another topic.

Rendition of Abbreviated Forms

When the writer provides a keyref to a glossentry topic that contains a <glossSurfaceForm> element, a process can emit the surface form in contexts where the term might be unfamiliar to the reader.

For instance, a process composing a book deliverable might emit the surface form on the first reference to the glossentry topic within the book or for every reference within a copyright or a warranty-related warning. A process generating an online page might emit the surface form as a hover tooltip on every instance of the term. A glossary publishing process would emit the surface form for the term.

For instance, if the topic with the keyref to the "abs" key provided the first appearance of the ABS term within a book, the sentence could be rendered as follows:

"The Anti-lock Braking System (ABS) will prevent the car from skidding in adverse weather conditions."

If the ABS term had appeared previously within the book, the same sentence could instead be rendered as follows:

"The ABS will prevent the car from skidding in adverse weather conditions."

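The first-reference behaviour described above can be sketched as a small amount of state carried through a deliverable. In this Python sketch, render_references and the shape of the glossary dictionary are assumptions made for illustration; the proposal leaves the choice of rendering policy to each process.

```python
def render_references(keys, glossary):
    """Render key references in reading order.

    The first reference to a key emits the surface form; later
    references emit the acronym. `glossary` maps each key to an
    (acronym, surface form) pair, a hypothetical structure.
    """
    seen = set()
    rendered = []
    for key in keys:
        acronym, surface = glossary[key]
        if key in seen:
            rendered.append(acronym)
        else:
            rendered.append(surface)
            seen.add(key)
    return rendered


GLOSSARY = {"abs": ("ABS", "Anti-lock Braking System (ABS)")}

# Two references to the same key within one book.
print(render_references(["abs", "abs"], GLOSSARY))
```

A process that expands every reference inside a warning, or that emits tooltips online, would replace the seen-set test with its own policy.
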
Translation Issues for Abbreviated Forms

The following cases for abbreviated forms must be contemplated when working with documents that require internationalization:

  • The source and target languages may have different forms for a term. One language may lack an abbreviation or acronym that's recognized in the other, or the preferred term may be an abbreviation or acronym in one language but the full form in another.

    Translation workbenches do not ordinarily support markup changes. Thus, an acronym and surface form may be different in the source language but translate to the same text in a target language.

    The following example shows this approach in English:

    <glossentry id="wmd" xml:lang="en">
      <glossAcronym>WMD</glossAcronym>
      <glossBody>
        <glossSurfaceForm>Weapons of Mass Destruction (WMD)</glossSurfaceForm>
      </glossBody>
    </glossentry>

    The Spanish translation can replace the element text without changing the markup:

    <glossentry id="wmd" xml:lang="es">
      <glossAcronym>armas de destrucción masiva</glossAcronym>
      <glossBody>
        <glossSurfaceForm>armas de destrucción masiva</glossSurfaceForm>
      </glossBody>
    </glossentry>

    This approach means that the element name for the preferred form in the target language is not accurate for the content (because <glossAcronym> stores text that is not an acronym). Because that discrepancy should not affect term resolution or glossary publishing processing, the discrepancy may be acceptable in the target language.

    If the element name must identify the form accurately in both the source and target languages and if the adopter cares only about term resolution and not about flagging the term as an acronym, the acronym declaration can use <glossterm> instead of <glossAcronym> in both the source and target languages and still provide a <glossSurfaceForm>. Again, this approach will not impede term resolution or glossary publishing processing. The following example shows this alternative approach in English:

    <glossentry id="wmd" xml:lang="en">
      <glossterm>WMD</glossterm>
      <glossBody>
        <glossSurfaceForm>Weapons of Mass Destruction (WMD)</glossSurfaceForm>
      </glossBody>
    </glossentry>

    The Spanish translation can replace the text of the preferred and surface forms without changing the markup:

    <glossentry id="wmd" xml:lang="es">
      <glossterm>armas de destrucción masiva</glossterm>
      <glossBody>
        <glossSurfaceForm>armas de destrucción masiva</glossSurfaceForm>
      </glossBody>
    </glossentry>

    In the future, translation tools with awareness of the DITA glossary vocabulary could allow translators to change the markup to reflect the target language, adding or removing alternate forms of the term as appropriate. So long as the glossentry topic declaring the term continued to exist, such practices wouldn't impede term resolution or glossary publishing processing.

  • In some languages, like Spanish, abbreviated-form expansion should be written in lower case. This can lead to a grammatical error if the first appearance of an abbreviated form occurs at the beginning of a sentence. The same problem may arise with the indefinite article in English 'a' or 'an' depending on whether the text to be inserted begins with a vowel. It is up to the composition/display software to handle this. For example, the acronym for AIDS should be translated as:

    <glossentry id="aids" xml:lang="es">
      <glossAcronym>SIDA</glossAcronym>
      <glossBody>
        <glossSurfaceForm>síndrome de inmuno-deficiencia adquirida (SIDA)</glossSurfaceForm>
        <glossAlt>
          <glossFullForm>síndrome de inmuno-deficiencia adquirida</glossFullForm>
        </glossAlt>
      </glossBody>
    </glossentry>

    Normally the <glossSurfaceForm> text from the above example could not be used at the start of a sentence, because it begins with a lower case letter. It is up to the composition software for the given language to cope with this input.

  • Abbreviated forms can cause problems for inflected languages because abbreviated form expansion needs to be presented in the nominative case, without any inflection. This can be achieved with a surface form that provides the full form in parentheses immediately following the acronym. For example, the Polish acronym for the European Union may be:

    <glossentry id="eu" xml:lang="pl">
      <glossAcronym>UE</glossAcronym>
      <glossBody>
        <glossSurfaceForm>UE (Unia Europejska)</glossSurfaceForm>
        <glossAlt>
          <glossFullForm>Unia Europejska</glossFullForm>
        </glossAlt>
      </glossBody>
    </glossentry>

    Using the above construct enables automated handling of the abbreviated form in Polish without causing any problems with grammatical inflection. For example, when stating that something occurred within the EU, the inflected form in Polish caused by the use of the locative case would have to be used. For the actual abbreviated form itself this is not a problem, since abbreviated forms are not inflected. Consider, for example, the phrase "In the European Union (EU) there are many institutions…":

    "W Unii Europejskiej (UE) jest wiele instytucji…"

    However, because the translator controls how the text is displayed through the <glossSurfaceForm> element, the first occurrence of the abbreviated form can use the following acceptable construct:

    "W UE (Unia Europejska) jest wiele instytucji…"

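The sentence-initial casing problem noted above for lower-case surface forms is left to the composition software. As a minimal sketch of what such software might do, the following Python function capitalizes the first letter only when the surface form opens a sentence. The function name and the sentence_start flag are assumptions of this example, and real composition software would need language-specific rules (for instance for the English a/an choice, which this sketch does not attempt).

```python
def surface_form_in_context(surface_form: str, sentence_start: bool) -> str:
    """Capitalize the first character when the form opens a sentence.

    Only the first character is changed so that an embedded acronym
    such as "(SIDA)" keeps its original casing.
    """
    if sentence_start and surface_form:
        return surface_form[0].upper() + surface_form[1:]
    return surface_form


# Surface form from the Spanish AIDS example above.
SPANISH = "síndrome de inmuno-deficiencia adquirida (SIDA)"

print(surface_form_in_context(SPANISH, sentence_start=True))
print(surface_form_in_context(SPANISH, sentence_start=False))
```
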
All terms

This section discusses the full glossentry markup available for any terminology application.

Longer description

DITA 1.1 introduced a simple glossary specialization to meet basic needs for publication as part of a bookmap.

The DITA 1.1 glossary specialization, however, is too simple to support many common glossary applications. For instance, many content publishers need to distinguish an abbreviation from the full term. In addition, a more complete representation of terminology can support processing such as the following:

Translation
The glossary identifies key terminology for human translators as well as the meaning that the term must retain in translation. In addition, the identification of special terms and the terminology data for those terms provides a dictionary that helps to enable automated translation of mentions of terms and content in the vicinity of such mentions.

Key terminology standards include TBX.

Semantic search
The glossary identifies the subjects associated with specific terms, which can enable indexing content based on the meaning of terms rather than the surface forms in the text.

Key semantic standards include TopicMaps and SKOS.

Handling of abbreviated forms

Abbreviated forms, such as acronyms, are ubiquitous in technical documentation. Abbreviated forms are a special case of glossary term because they need to be expanded to the full form under some conditions (such as the first encounter within a printed document). In electronically published documents, abbreviated form expansions can also be made available in the form of a hyperlink or 'tool tip' mechanism. In addition, the expanded text of abbreviated forms should be available for automatic inclusion in glossary entries for the publication. This proposal relates to all types of abbreviations, such as acronyms, initialisms, apocope, clipping, elision, syncope, syllabic abbreviation, and portmanteau.

To enable these applications, DITA 1.2 allows additional detail about the term and additional methods for referring to terms that can deliver either abbreviated or surface forms of the term.

Statement of Requirement

The following requirements apply to glossary terms generally:

  • Supply the preferred term and the verbal definition of the subject of the term including an explanation of the cases that are in or out of scope for the term.
  • List synonyms, abbreviations, acronyms, and other alternative terms with the same meaning. (Called a synonym set by WordNet; please see http://en.wikipedia.org/wiki/Synsets.)
  • Provide directions for the correct use of the preferred or alternate terms.
  • Allow extensibility for more detailed terminology data.

In addition, abbreviated forms and their translations require special handling:

  • Some abbreviated forms are never translated, especially those that are intended for a knowledgeable, technical audience, as well as those that refer to standardized international concepts, such as XML.
  • Some abbreviated forms represent a brand name for which the original expanded form is no longer used or is secondary to the abbreviated forms.
  • Abbreviated forms such as xml, jpg, and html are typically used in their original form, that is, they may be quoted in lower case, and they are not translated.
  • Abbreviated forms that have equivalent expressions in other languages are typically translated. United Nations (UN) and Weapons of Mass Destruction (WMD) have equivalents in other languages besides English. For instance, the French translation of “UN” is “ONU”.
  • Some abbreviated forms are translated for clarity and also referred to in their original untranslated form. For instance, OASIS may be translated so that readers understand its significance in their native language but the original acronym would be retained in the translation to facilitate electronic search.
  • The expanded form of an abbreviated form in the target language may require a different formulation than the expanded form of the abbreviated form in the source language, depending on the target audience and the grammatical features of the target language.

For example, the expansion of an abbreviated form in English might consist of the abbreviated form followed by its full form in parentheses. By contrast, the translated version might consist of the expanded form followed by the abbreviated form in parentheses. The translated version might also include the English and the translation.

For example, in a Polish book on Java Web programming, the first reference to JSP may appear as follows:

"JSP (ang. Java Server Pages)"

Another example from a publication concerning OASIS:

"OASIS (ang. Organization for the Advancement of Structured Information Standards—organizacja dla propagowania strukturalnych standardów informacyjnych)"

In the first example, the translator assumes the reader will not require a translation of the English abbreviated form. In the second example, the translator assumes the reader may not understand the English expanded form and therefore adds the translation.

Use Cases

  • Publishing an explanatory glossary for users.
  • Providing referenceable terms that can resolve to expanded forms when appropriate and that can resolve in different ways in translation.
  • Guiding writers on the correct use of terms.
  • Informing translators on the correct interpretation of terms.
  • Keeping a terminology database up to date by importing maintained terms from content sources.
  • Supplying automated translation, text analysis, or search indexing tools with linguistic data.
  • Defining subjects for classification.

Scope

Moderate: adding elements to one specialized topic, providing a map domain for defining keys, and providing an element domain for referring to keys.

Technical Requirements

The full set of elements provided by the expanded glossentry topic includes the following elements:

Specialized element: <glossentry>
Base element: <concept>
Content:
  1. one <glossterm>, <glossAbbreviation>, or <glossAcronym>
  2. zero or one <glossdef>
  3. zero or one <prolog>
  4. zero or one <glossBody>
  5. zero or one <related-links>
Purpose: Specifies the preferred and alternate forms of a term and the subject designated by those terms within a glossary or other kind of terminology set. The <glossAbbreviation> or <glossAcronym> elements can be used instead of <glossterm> to indicate that the preferred term has the specified form.
Note: Some terminology discussions use "concept" to denote the meaning of a term. For the DITA community, however, "concept" has a strong association with the core DITA concept topic type. To avoid confusion, this proposal denotes the meaning of a term with "subject" (which has appropriate connotations by way of "subject classification" and the Dublin Core subject property).

Specialized elements: <glossterm>, <glossAbbreviation>, <glossAcronym>, <glossFullForm>, <glossShortForm>, <glossSynonym>
Base element: <title>
Content: title content for <glossterm> for consistency with DITA 1.1; text, <term>, <keyword>, or <tm> content for the other <title> specializations
Purpose: Identifies the role of one term with respect to other variant terms. The <glossterm>, <glossAbbreviation>, and <glossAcronym> elements can appear within the <glossentry> element to indicate the preferred term. The other <title> specializations can appear within the <glossAlt> element to indicate alternative forms with the same meaning. In particular, <glossShortForm> can indicate a shorter alternative to the preferred term and <glossFullForm> can indicate a longer alternative to the preferred term (especially where the preferred term is a <glossAbbreviation> or <glossAcronym> element).

Specialized element: <glossdef>
Base element: <abstract>
Content: section content or <shortdesc>
Purpose: Provides a verbal definition of the subject of a term for writers and users.

Specialized element: <glossBody>
Base element: <conbody>
Content:
  1. zero or one <glossPartOfSpeech>
  2. zero or one <glossStatus>
  3. zero or more <glossProperty>
  4. zero or one <glossSurfaceForm>
  5. zero or one <glossUsage>
  6. zero or one <glossScopeNote>
  7. zero or more <glossSymbol>
  8. zero or more <note>
  9. zero or more <glossAlt>
Purpose: Represents terminology detail. The part of speech applies to all term forms to encourage consistency of the alternate forms with the preferred term. The surface form presents the term in an unambiguous way. The status indicates the overall status of the subject of the term. The <glossProperty> and <note> elements are extension points for more detail about the preferred term or its subject (such as the linguistic properties from basic or full TBX).

Specialized element: <glossPartOfSpeech>
Base element: <data>
Content: value attribute enumerated as noun, properNoun, verb, adjective, or adverb; empty content
Purpose: Identifies the part of speech for the preferred and alternate terms. Alternate terms must have the same part of speech as the preferred term because all terms in the glossentry topic designate the same subject. If the part of speech isn't specified, the default is a noun for the standard enumeration.
Note: The standard enumeration is extensible or replaceable. The enumeration is validated by means of the proposed controlled values mechanism or through processing rather than validated as an XML enumeration.

Specialized element: <glossStatus>
Base element: <data>
Content: value attribute enumerated as restricted, prohibited, or obsolete; empty content
Purpose: Identifies the usage status of a preferred or alternate term. If the status isn't specified, the preferred term is treated as a preferred term and an alternate term is treated as an allowed term.
Note: This enumeration must be extensible or replaceable. The enumeration is validated by means of the proposed controlled values mechanism or through processing rather than validated as an XML enumeration.

Specialized element: <glossProperty>
Base element: <data>
Content: data content
Purpose: An extension point for linguistic or semantic properties such as the gender of the term.

Specialized element: <glossSurfaceForm>
Base element: <p>
Content: text, <term>, <keyword>, or <tm> content
Purpose: Specifies an unambiguous presentation of the term that may combine multiple forms. For instance, for an acronym, the <glossSurfaceForm> might provide the full form as well as the acronym in parentheses. The surface form is suitable to introduce the term in new contexts.

Specialized element: <glossUsage>
Base element: <note>
Content: note content
Purpose: Any information about the correct usage of the term.

Specialized element: <glossScopeNote>
Base element: <note>
Content: note content
Purpose: A clarification of the subject designated by the terms such as examples of included or excluded companies or products. For instance, a scope note for "Linux" might explain that the term doesn't apply to UNIX products and give some examples of Linux products that are included as well as UNIX products that are excluded.

Specialized element: <glossSymbol>
Base element: <image>
Content: image content
Purpose: Identifies a standard icon associated with the subject of the term.

Specialized element: <glossAlt>
Base element: <section>
Content:
  1. one <glossAbbreviation>, <glossAcronym>, <glossFullForm>, <glossShortForm>, or <glossSynonym>
  2. zero or one <glossStatus>
  3. zero or more <glossProperty>
  4. zero or one <glossUsage>
  5. zero or more <note>
  6. zero or more <glossAlternateFor>
Purpose: Identifies a variant term for the preferred term. Any list of alternative terms is, of course, specific to the language and may get longer or shorter during translation.

Specialized element: <glossAlternateFor>
Base element: <xref>
Content: Empty content
Purpose: Indicates when a variant term has a relationship to another variant term as well as to the preferred term.
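
Because the <glossPartOfSpeech> and <glossStatus> enumerations are meant to be checked through processing rather than as XML enumerations, a check along the following lines is one possibility. This Python sketch is only an illustration: the function name is hypothetical, and the allowed sets are passed as parameters so that an adopter could extend or replace them.

```python
import xml.etree.ElementTree as ET

# Standard enumerations as listed in the element reference above.
PART_OF_SPEECH = {"noun", "properNoun", "verb", "adjective", "adverb"}
STATUS = {"restricted", "prohibited", "obsolete"}


def check_enumerations(entry_xml, part_of_speech=PART_OF_SPEECH, status=STATUS):
    """Return a list of messages for values outside the allowed sets."""
    entry = ET.fromstring(entry_xml)
    problems = []
    for elem in entry.iter("glossPartOfSpeech"):
        value = elem.get("value")
        if value not in part_of_speech:
            problems.append(f"glossPartOfSpeech: unexpected value {value!r}")
    for elem in entry.iter("glossStatus"):
        value = elem.get("value")
        if value not in status:
            problems.append(f"glossStatus: unexpected value {value!r}")
    return problems


VALID = (
    '<glossentry id="usbfd"><glossterm>USB flash drive</glossterm><glossBody>'
    '<glossPartOfSpeech value="noun"/><glossAlt>'
    '<glossAbbreviation>stick</glossAbbreviation>'
    '<glossStatus value="prohibited"/></glossAlt></glossBody></glossentry>'
)
# "deprecated" is a made-up value used only to trigger a message.
INVALID = VALID.replace('value="prohibited"', 'value="deprecated"')

print(check_enumerations(VALID))
print(check_enumerations(INVALID))
```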

The following example shows the minimum declaration of a term:

<glossentry id="highavail">
    <glossterm>High Availability</glossterm>
</glossentry>

The following example shows a detailed glossary entry specifying the usage for the preferred and alternate terms:

<glossentry id="usbfd">
  <glossterm>USB flash drive</glossterm>
  <glossdef>A small portable drive.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossUsage>Do not provide in upper case (as in "USB Flash Drive") because that suggests a trademark.</glossUsage>
    <glossAlt>
      <glossAcronym>UFD</glossAcronym>
      <glossUsage>Explain the acronym on first occurrence.</glossUsage>
    </glossAlt>
    <glossAlt id="memoryStick">
      <glossSynonym>memory stick</glossSynonym>
      <glossUsage>This is a colloquial term.</glossUsage>
    </glossAlt>
    <glossAlt>
      <glossAbbreviation>stick</glossAbbreviation>
      <glossStatus value="prohibited"/>
      <glossUsage>This is too colloquial.</glossUsage>
      <glossAlternateFor href="#usbfd/memoryStick"/>
    </glossAlt>
    <glossAlt>
      <glossAbbreviation>flash</glossAbbreviation>
      <glossStatus value="prohibited"/>
      <glossUsage>This short form is ambiguous.</glossUsage>
    </glossAlt>
  </glossBody>
</glossentry>
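
The detailed entry above does not use <glossSurfaceForm>. As a minimal sketch, the same entry could carry a surface form that introduces the full term together with its acronym. The position directly after <glossPartOfSpeech> and the wording of the surface form are assumptions for illustration, not taken from the proposal:

```xml
<glossentry id="usbfd">
  <glossterm>USB flash drive</glossterm>
  <glossdef>A small portable drive.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <!-- Assumed position and wording: full form plus acronym in parentheses -->
    <glossSurfaceForm>USB flash drive (UFD)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>UFD</glossAcronym>
      <glossUsage>Explain the acronym on first occurrence.</glossUsage>
    </glossAlt>
  </glossBody>
</glossentry>
```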

Using the standard keyref mechanism, the writer can assign a key to the declaration topic and refer to that key to insert the preferred term. The benefit of using a reference is that the preferred term is maintained in one place:

<map>
  ...
  <topicref keys="reliability" href="highavail.dita"
        linking="none" toc="no" print="no" search="no"/>
  ...
  <topicref href="configdb.dita"/>
  ...
</map>

<task id="configdb">
  <title>Configuring the database.</title>
  ...
    <context>To enable <term keyref="reliability"/>,
        you configure the database</context>
  ...
</task>

Two new domains support easy definition and use of keys for glossary entry topics:

  • A map domain specializes <topicref> to provide a <glossref> element. The <glossref> element requires the keys and href attributes and defaults the print, search, and toc attributes to the "no" value and the linking attribute to the "none" value. The <glossref> element is always empty.
  • A topic domain specializes <term> to provide an <abbreviation> element. The <abbreviation> element requires the keyref attribute and is always empty.
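
As a sketch, the earlier map and task might be rewritten with these two elements as follows. The example relies on the attribute defaults described in the list above, so print, search, toc, and linking are omitted. The <abbreviation> element is used here only to show the syntax, since the referenced term is not itself an abbreviation:

```xml
<map>
  ...
  <!-- print, search, and toc default to "no"; linking defaults to "none" -->
  <glossref keys="reliability" href="highavail.dita"/>
  ...
  <topicref href="configdb.dita"/>
  ...
</map>

<task id="configdb">
  <title>Configuring the database.</title>
  ...
    <context>To enable <abbreviation keyref="reliability"/>,
        you configure the database</context>
  ...
</task>
```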

Writers can set the linking attribute to the "target" value on the <glossref> element to enable linking from each use of the term to the glossary entry. The <glossref> element is only a convenience. Writers can always use the standard capabilities of the keyref mechanism. For instance, writers can use the <topicref> element with a keys attribute to pull a glossary topic into a TOC context while defining a key.
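
For instance, the two options might look like the following sketch. The two maps are alternatives rather than pieces of one map, and the surrounding map content is elided:

```xml
<!-- Option 1: glossref with linking enabled from uses of the term -->
<map>
  ...
  <glossref keys="reliability" href="highavail.dita" linking="target"/>
  ...
</map>

<!-- Option 2: plain topicref that defines the key and also appears in the TOC -->
<map>
  ...
  <topicref keys="reliability" href="highavail.dita"/>
  ...
</map>
```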

The <abbreviation> element is also a convenience. Writers can use the <term> element with a keyref attribute to refer to a glossentry regardless of whether the preferred form of the term is an abbreviation or not. Processing inserts text from the glossentry topic only when the referencing <term> element doesn't contain text. As a result, writers can use the <term> element to delimit terms within content while identifying the corresponding glossary entry. That is, the <term> element can provide a context-specific surface form as its content where appropriate.
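
A sketch of the two cases, assuming that a key named usbfd (a hypothetical key, not defined in the proposal) has been assigned to the USB flash drive entry shown earlier. The sentences are illustrative only:

```xml
<!-- Empty element: processing inserts text from the glossentry topic -->
<p>Insert the <term keyref="usbfd"/> into a free port.</p>

<!-- Element with content: the writer's text is kept as the surface form,
     and the keyref still identifies the glossary entry -->
<p>Remove all <term keyref="usbfd">USB flash drives</term> before restarting.</p>
```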

For authoring convenience, a <glossgroup> topic can contain multiple <glossentry> topics:

Base Element Content Purpose
<concept> <glossgroup>
  1. one <title>
  2. zero or one <prolog>
  3. zero or more <glossgroup> or <glossentry> topics
Groups a set of glossary entries for some purpose, for instance, for convenient maintenance based on the alphabetic collation of the preferred terms or on the subject matter covered by the terms.
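
A minimal sketch of a group holding the two entries used earlier. The group id and title are illustrative assumptions, and the nesting follows the content model listed above:

```xml
<glossgroup id="sampleGlossary">
  <title>Sample glossary</title>
  <glossentry id="highavail">
    <glossterm>High Availability</glossterm>
  </glossentry>
  <glossentry id="usbfd">
    <glossterm>USB flash drive</glossterm>
    <glossdef>A small portable drive.</glossdef>
  </glossentry>
</glossgroup>
```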

Relationships between the subjects of terms (such as the hypernym or kind-of relationship and the holonym or part-of relationship specified by WordNet) can be specified for glossary topics by a subject scheme map. (Please see Proposal 12031 for Controlled Values.)

New or Changed Specification Language

The Language Reference for the glossentry topic should be revised to reflect the contents of this proposal including translation considerations and their impact on the use of abbreviations.

Costs

  • Implementation of the DTD and Schema changes for the glossentry topic, of the map domain for the <glossref> element, of the topic domain for the <abbreviation> element, and of the glossgroup topic.

  • Implementation of special processing to emit the surface form when appropriate.

Benefits

  • Glossary data can serve publishing, text analysis, search, translation processing, and other applications.

    In particular, abbreviated forms can be handled in a uniform and consistent manner by putting resolution of the abbreviated form under the control of the composition software so that glossary, tooltip, and first forms can be provided as required to meet the end-user requirements.


