Subject: Fw: [dita] Issue for unifying acronyms and glossary
I know that Erik is not able to send notes to the dita-translation list, so I'm forwarding this on his behalf.

Thanks,

Robert D Anderson
IBM Authoring Tools Development
Chief Architect, DITA Open Toolkit
(507) 253-8787, T/L 553-8787 (Good Monday & Thursday)

----- Forwarded by Robert D Anderson/Rochester/IBM on 01/08/2008 07:22 AM -----

Erik Hennum/Oakland/IBM
01/08/2008 12:15 AM
To "Andrzej Zydron" <azydron@xml-intl.com>, Don Day/Austin/IBM, "Gershon L Joseph" <gershon@tech-tav.com>, "JoAnn Hackos" <joann.hackos@comtech-serv.com>, "Kara Warburton" <KARA@CA.IBM.COM>, Robert D Anderson/Rochester/IBM, "Rodolfo M. Raya" <rodolfo@heartsome.net>
cc
Subject RE: [dita] Issue for unifying acronyms and glossary (Document link: Robert D Anderson)

Hi, Esteemed Translation and Terminology Experts:

Thanks for considering the proposal to unify terminology markup in DITA. I've appended a revision based on the feedback you gave today to Robert:

(See attached file: IssueGlossary12026.html)

The main changes:

  Added a usage section at the front to clarify single sourcing of acronym resolution, glossary publication, and termbase population.
  Assembled the subset of the full vocabulary used for acronyms as a separate section (labelled "Acronym Terms") so it can be considered easily without confusion.
  Revised the section on "Translation Issues for Abbreviated Forms."

I understand from Robert that there's some concern that the attempt to integrate the proposals is hostile to the acronym proposal and the Translation Subcommittee. That is not my intent in the slightest. The clear value of what the Translation Subcommittee has done is driving the attempt to unify the proposals and to capture the feedback from the Translation Subcommittee. If this revision falls short, please let me know where and how so we can see if we can fix those shortcomings.

The concern motivating this effort is improving the environment for DITA adopters who have single sourcing requirements for terminology.
Here's a real example. A DITA adopter has content that includes the SMTP acronym. To minimize typos and enable translation, the adopter wants writers to refer to the acronym instead of just embedding the acronym in the text. Many end users don't know that the SMTP acronym means Simple Mail Transfer Protocol, so the adopter wants both to expand SMTP on first appearance and to include an explanation of the SMTP acronym in the glossary.

Should the adopter have to maintain Simple Mail Transfer Protocol as a surface form in two different places -- once for acronym resolution and separately for glossary publication? Wouldn't it be better if the adopter could maintain the SMTP acronym, the Simple Mail Transfer Protocol surface form, and the explanation in one place and use that declaration for all processing (just like other XML single sourcing)?

Finally, I wanted to address concerns that the glossary proposal was a recent addition to DITA 1.2:

Acronyms were part of the original glossary proposal in 2005:
http://www.oasis-open.org/apps/org/workgroup/dita/email/archives/200511/msg00002.html

When the original glossary proposal was pared back for DITA 1.1, the explicit intent was to reintroduce abbreviations in a future version.
("For an example of the limitations to be addressed in DITA 1.2, many content publishers will need to distinguish a labelling abbreviation from the full term.")
http://www.oasis-open.org/apps/org/workgroup/dita/email/archives/200603/msg00058.html

The approved DITA 1.1 proposal also noted that a future version would restore the deferred parts of the original proposal:
http://www.oasis-open.org/apps/org/workgroup/dita/download.php/1751/Issue14a.html

Glossary was on the top 5 list that IBM submitted for DITA 1.2 and was accepted for the official list from the start of work on DITA 1.2:
http://www.oasis-open.org/apps/org/workgroup/dita/email/archives/200703/msg00071.html

The glossary proposal came up several times during the summer -- for instance:
http://www.oasis-open.org/apps/org/workgroup/dita/email/archives/200705/msg00032.html

The proposal was brought forward before the deadline for DITA 1.2:
http://www.oasis-open.org/apps/org/workgroup/dita/download.php/26130/IssueGlossary12026.html

The proposal was scheduled for a vote and approved:
http://www.oasis-open.org/apps/org/workgroup/dita/download.php/26192/DitaTCMeetingMinutes071120.txt
http://www.oasis-open.org/apps/org/workgroup/dita/download.php/26263/DitaTCMeetingMinutes071127.txt

I know that the acronym proposal has its own history. From my perspective, both proposals have strengths, and the union of those strengths would be a better solution for DITA adopters than separate, independent vocabularies.

Thanks again for keeping an open mind about this issue,

Erik Hennum
ehennum@us.ibm.com

Title: DITA Proposed Feature #12026 and #12038
Build on the DITA 1.1 glossary specialization for more complete support of glossary, linguistic, and semantic applications and also to assist in the resolution and handling of abbreviated-form text such as acronyms, general abbreviations, and short forms in source and target text within DITA documents.

Terminology applications

This section gives examples of how subsets of the glossentry markup meet requirements for different applications.

Usage for acronym resolution

An adopter interested only in term resolution for acronyms can declare an acronym with a glossentry topic similar to the following example:

    <glossentry id="abs">
      <glossAcronym>ABS</glossAcronym>
      <glossBody>
        <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
      </glossBody>
    </glossentry>

The adopter can declare a key for the acronym using the standard DITA 1.2 keyref mechanism:

    <map>
      ...
      <topicref href="maintcar.dita"/>
      ...
      <glossref keys="abs" href="antiLockBrake.dita"/>
      ... key declarations for other referenced acronyms ...
    </map>

The adopter can then refer to the acronym using the standard DITA 1.2 keyref mechanism:

    <task id="maintcar">
      ...
      <info>The <abbreviation keyref="abs"/> will prevent the car from skidding ...</info>
      ...
    </task>

Processes can resolve the "abs" reference to the <glossSurfaceForm> text in introductory contexts and to the <glossAcronym> text in other contexts.

Usage for glossary publishing

An adopter interested only in traditional glossary publishing can explain one sense of a term with a glossentry topic similar to the following example:

    <glossentry id="abs">
      <glossAcronym>ABS</glossAcronym>
      <glossdef>A brake technology that minimizes skids.</glossdef>
    </glossentry>

The adopter can then pull together a subset of the defined terms for a deliverable as in the following example:

    <map>
      ...
      <topichead navtitle="glossary">
        <topicref href="antiLockBrake.dita"/>
        ... other terms in the glossary for this deliverable ...
      </topichead>
    </map>

To produce a traditional glossary, a process can sort the terms included in a deliverable and list the explained senses under each term.

Usage for single sourcing term resolution and glossary publishing

Adopters don't have to declare the same acronym in different ways for different purposes but instead can single source a declaration of acronym terms for multiple purposes. An adopter who needs both to refer to an acronym and to list the acronym in a published glossary would provide an explanation of the acronym as in the following example:

    <glossentry id="abs">
      <glossAcronym>ABS</glossAcronym>
      <glossdef>A brake technology that minimizes skids.</glossdef>
      <glossBody>
        <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
      </glossBody>
    </glossentry>

The glossary can include the explained acronym (as shown in the following example) as well as glossary terms that aren't acronyms. The map can also declare keys for acronyms that are referenced but not included in the glossary:

    <map>
      ...
      <topicref href="maintcar.dita"/>
      ...
      <topichead navtitle="glossary">
        <topicref keys="abs" href="antiLockBrake.dita"/>
        ... other referenced terms in the glossary ...
      </topichead>
      ... key declarations for other referenced acronyms that aren't in the glossary ...
    </map>

The adopter can still refer to the acronym with the <abbreviation> element as in the following example:

    <task id="maintcar">
      ...
      <info>The <abbreviation keyref="abs"/> will prevent the car from skidding ...</info>
      ...
    </task>

Processing for term resolution to either the <glossSurfaceForm> or <glossAcronym> text and processing for glossary publishing work as before.

Usage for populating a terminology database

While a number of text analysis tools exist, the challenge for adopters is populating the terminology database that enables use of such tools. Published glossaries provide a practical source for terminology to populate such terminology databases.
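As a rough illustration of how a published glossary entry could feed a terminology database, the following Python sketch flattens the single-sourced "abs" glossentry shown earlier into a simple record. The record layout (the dictionary keys) and the function names are assumptions made for this example only; the proposal does not define a termbase format or a harvesting process.

```python
import xml.etree.ElementTree as ET

# The single-sourced glossentry from the example above.
GLOSSENTRY = """<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
  </glossBody>
</glossentry>"""


def text_of(parent, path):
    """Return the stripped text of the first element matching path, or None."""
    node = parent.find(path)
    if node is None or node.text is None:
        return None
    return node.text.strip()


def to_term_record(xml_source):
    """Flatten one glossentry topic into a dict (hypothetical record layout)."""
    entry = ET.fromstring(xml_source)
    return {
        "id": entry.get("id"),
        "acronym": text_of(entry, "glossAcronym"),
        "definition": text_of(entry, "glossdef"),
        "surface_form": text_of(entry, "glossBody/glossSurfaceForm"),
    }


record = to_term_record(GLOSSENTRY)
print(record)
```

Optional elements that are absent from an entry simply yield None in this sketch, which mirrors the idea that adopters add detail only as their requirements grow.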
An adopter whose requirements include not only acronym resolution and glossary publishing but also populating a terminology database can create glossentry topics similar to the following:

    <glossentry id="abs">
      <glossAcronym>ABS</glossAcronym>
      <glossdef>A brake technology that minimizes skids.</glossdef>
      <glossBody>
        <glossPartOfSpeech value="noun"/>
        <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
        <glossAlt>
          <glossSynonym>Anti-skid Brakes</glossSynonym>
          <glossUsage>Allowed in legacy content but not preferred.</glossUsage>
          <glossStatus value="restricted"/>
        </glossAlt>
      </glossBody>
    </glossentry>

As illustrated by these examples, adopters can scale up for more sophisticated applications as their requirements change by taking advantage of optional elements to provide additional detail about the term.

Acronym terms

This section discusses the subset of the glossentry vocabulary specific to acronyms.

Reference for acronym terms

To use the glossentry topic for acronym resolution, the writer takes advantage of the following four elements
The <glossentry> topic provides additional subelements that are optional but available to scale up for single sourcing for additional purposes such as glossary publishing of the acronym (see Technical Requirements below). Two new domains complement the glossary entry topic to make it easy to refer to acronyms (as shown in the example of acronym resolution):
Rendition of Abbreviated Forms

When the writer provides a keyref to a glossentry topic that contains a <glossSurfaceForm> element, a process can emit the surface form in contexts where the term might be unfamiliar to the reader. For instance, a process composing a book deliverable might emit the surface form on the first reference to the glossentry topic within the book or for every reference within a copyright or a warranty-related warning. A process generating an online page might emit the surface form as a hover tooltip on every instance of the term. A glossary publishing process would emit the surface form for the term.

For instance, if the topic with the keyref to the "abs" key provided the first appearance of the ABS term within a book, the sentence could be rendered as follows:

"The Anti-lock Braking System (ABS) will prevent the car from skidding in adverse weather conditions."

If the ABS term had appeared previously within the book, the same sentence could instead be rendered as follows:

"The ABS will prevent the car from skidding in adverse weather conditions."

Translation Issues for Abbreviated Forms

The following cases for abbreviated forms must be contemplated when working with documents that require internationalization:
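The first-reference behaviour described under "Rendition of Abbreviated Forms" can be sketched in Python as follows. This is only one possible policy (surface form on the first reference in a deliverable, acronym afterwards); the class and function names are hypothetical, and a real process would also have to handle the tooltip, warning, and translation cases mentioned in the text.

```python
import xml.etree.ElementTree as ET

# The acronym declaration from the "abs" example.
GLOSSENTRY = """<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
  </glossBody>
</glossentry>"""


def load_forms(xml_source):
    """Return (acronym, surface form) read from one glossentry topic."""
    entry = ET.fromstring(xml_source)
    acronym = entry.findtext("glossAcronym")
    surface = entry.findtext("glossBody/glossSurfaceForm")
    return acronym, surface


class AcronymRenderer:
    """Assumed policy: surface form on first reference, acronym afterwards."""

    def __init__(self, glossary):
        self.glossary = glossary  # maps key -> (acronym, surface form)
        self.seen = set()         # keys already rendered in this deliverable

    def render(self, key):
        acronym, surface = self.glossary[key]
        first_reference = key not in self.seen
        self.seen.add(key)
        if first_reference and surface:
            return surface
        return acronym


renderer = AcronymRenderer({"abs": load_forms(GLOSSENTRY)})
first = "The " + renderer.render("abs") + " will prevent the car from skidding."
later = "The " + renderer.render("abs") + " will prevent the car from skidding."
print(first)
print(later)
```

Keeping the "seen" set on the renderer object scopes the first-reference rule to one deliverable, so a separate book or page would start with a fresh renderer.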
All terms

This section discusses the full glossentry markup available for any terminology application.

Longer description

DITA 1.1 introduced a simple glossary specialization to meet basic needs for publication as part of a bookmap. The DITA 1.1 glossary specialization, however, is too simple to support many common glossary applications. For instance, many content publishers need to distinguish an abbreviation from the full term. In addition, a more complete representation of terminology can support processing such as the following:
To enable these applications, DITA 1.2 allows additional detail about the term and additional methods for referring to terms that can deliver either abbreviated or surface forms of the term.

Statement of Requirement

The following requirements apply to glossary terms generally:
In addition, abbreviated forms and their translations require special handling:
For example, the expansion of an abbreviated form in English might consist of the abbreviated form followed by its full form in parentheses. By contrast, the translated version might consist of the expanded form followed by the abbreviated form in parentheses. The translated version might also include the English and the translation.

For example, in a Polish book on Java Web programming, the first reference to JSP may appear as follows:

"JSP (ang. Java Server Pages)"

Another example from a publication concerning OASIS:

"OASIS (ang. Organization for the Advancement of Structured Information Standards—organizacja dla propagowania strukturalnych standardów informacyjnych)"

In the first example, the translator assumes the reader will not require a translation of the English abbreviated form. In the second example, the translator assumes the reader may not understand the English expanded form and therefore adds the translation.

Use Cases
Scope

Moderate: adding elements to one specialized topic, providing a map domain for defining keys, and providing an element domain for referring to keys.

Technical Requirements

The full set of elements provided by the expanded glossentry topic includes the following elements:
The following example shows the minimum declaration of a term:

    <glossentry id="highavail">
      <glossterm>High Availability</glossterm>
    </glossentry>

The following example shows a detailed glossary entry specifying the usage for the preferred and alternate terms:

    <glossentry id="usbfd">
      <glossterm>USB flash drive</glossterm>
      <glossdef>A small portable drive.</glossdef>
      <glossBody>
        <glossPartOfSpeech value="noun"/>
        <glossUsage>Do not provide in upper case (as in "USB Flash Drive") because that suggests a trademark.</glossUsage>
        <glossAlt>
          <glossAcronym>UFD</glossAcronym>
          <glossUsage>Explain the acronym on first occurrence.</glossUsage>
        </glossAlt>
        <glossAlt id="memoryStick">
          <glossSynonym>memory stick</glossSynonym>
          <glossUsage>This is a colloquial term.</glossUsage>
        </glossAlt>
        <glossAlt>
          <glossAbbreviation>stick</glossAbbreviation>
          <glossStatus value="prohibited"/>
          <glossUsage>This is too colloquial.</glossUsage>
          <glossAlternateFor href="#usbfd/memoryStick"/>
        </glossAlt>
        <glossAlt>
          <glossAbbreviation>flash</glossAbbreviation>
          <glossStatus value="prohibited"/>
          <glossUsage>This short form is ambiguous.</glossUsage>
        </glossAlt>
      </glossBody>
    </glossentry>

Using the standard keyref mechanism, the writer can assign a key to the declaration topic and refer to the key to insert the preferred term. The benefit of using a reference is that the preferred term can be maintained in one place:

    <map>
      ...
      <topicref keys="reliability" href="highavail.dita" linking="none" toc="no" print="no" search="no"/>
      ...
      <topicref href="configdb.dita"/>
      ...
    </map>

    <task id="configdb">
      <title>Configuring the database.</title>
      ...
      <context>To enable <term keyref="reliability"/>, you configure the database</context>
      ...
    </task>

Two new domains support easy definition and use of keys for glossary entry topics:
Writers can set the linking attribute to the "target" value on the <glossref> element to enable linking from the use to the glossary term.

The <glossref> element is only a convenience. Writers can always use the standard capabilities of the keyref mechanism. For instance, writers can use the <topicref> element with a keys attribute to pull a glossary topic into a TOC context while defining a key.

The <abbreviation> element is also a convenience. Writers can use the <term> element with a keyref attribute to refer to a glossentry regardless of whether the preferred form of the term is an abbreviation or not.

Processing inserts text from the glossentry topic only when the referencing <term> element doesn't contain text. As a result, writers can use the <term> element to delimit terms within content while identifying the corresponding glossary entry. That is, the <term> element can provide a context-specific surface form as its content where appropriate.

For authoring convenience, a <glossgroup> topic can contain multiple <glossentry> topics:
Relationships between the subjects of terms (such as the hypernym or kind-of relationship and the holonym or part-of relationship specified by WordNet) can be specified for glossary topics by a subject scheme map. (Please see Proposal 12031 for Controlled Values.)

New or Changed Specification Language

The Language Reference for the glossentry topic should be revised to reflect the contents of this proposal, including translation considerations and their impact on the use of abbreviations.

Costs
Benefits