dita-translation message

Subject: RE: [dita] Issue for unifying acronyms and glossary

From: "JoAnn Hackos" <joann.hackos@comtech-serv.com>
To: "Erik Hennum" <ehennum@us.ibm.com>,<dita-translation@lists.oasis-open.org>
Date: Wed, 19 Dec 2007 06:31:59 -0700

Erik,

Please include the Translation SC in these emails. Everyone on the SC has a voice.

What happened to the expanded form? We actually need everything that is in our original proposal. Each item has a distinct purpose. Remember that under no circumstances can translators change the XML.

JoAnn

JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver CO 80215
303-232-7586
joann.hackos@comtech-serv.com

From: Erik Hennum [mailto:ehennum@us.ibm.com]
Sent: Tuesday, December 18, 2007 10:19 AM
To: JoAnn Hackos
Cc: Andrzej Zydron; dita@lists.oasis-open.org; Gershon L Joseph; Ogden, Jeff; Kara Warburton
Subject: RE: [dita] Issue for unifying acronyms and glossary

Hi, Gershon and JoAnn:

I agree entirely that it would be good to take a little more time with this. We really don't want to rush in something that will create problems for years to come.

I'm wondering if the following refinements would improve the integrated proposal for the minimalist term referencing use cases (including acronym referencing):

* Make <glossdef> optional.
* Make <glossPartOfSpeech> optional (assumed to default to noun if not specified).
* Move <glossSurfaceForm> to <glossBody> and specialize from <p> on the grounds that the surface form will never have usage or linguistic properties; i.e., it's better to treat the surface form as a property of the preferred form of the term rather than as an alternate form.

That allows minimalist declarations of terms like the following:

[Image: minimalist glossentry declarations for other terms]

Note that the acronym is represented as a glossterm. That handles the case where translation workbench software isn't able to change elements -- that is, the translator doesn't change the element. As specified in the current integrated proposal, references to the abs topic via keyref will resolve to <glossterm> text in most cases but to the <glossSurfaceForm> text where the more complete form is appropriate.
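
The keyref resolution described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entry markup, the expansion of "abs", and the function name resolve_linktext are hypothetical, and the want_complete_form flag stands in for whatever rule a processor uses to decide that the more complete form is appropriate.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimalist entry. The nesting follows the refinements listed
# above (<glossSurfaceForm> inside <glossBody>); the term text is illustrative.
MINIMAL_ENTRY = """
<glossentry id="abs">
  <glossterm>ABS</glossterm>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
  </glossBody>
</glossentry>
"""

def resolve_linktext(entry, want_complete_form):
    """Return the text a keyref to this entry would display (assumed rule)."""
    if want_complete_form:
        surface = entry.findtext("glossBody/glossSurfaceForm")
        if surface and surface.strip():
            return surface.strip()
    # Default case: the preferred term, which the translator edits in place.
    return (entry.findtext("glossterm") or "").strip()

entry = ET.fromstring(MINIMAL_ENTRY)
print(resolve_linktext(entry, want_complete_form=False))
print(resolve_linktext(entry, want_complete_form=True))
```

Because the acronym lives in <glossterm>, a translator only has to change text content, never an element name.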

The best practice for glossentries would be much more complete, similar to the following (where translation software can modify markup):

[Image: best practice glossentries for other terms]

Such best practices could be enforced by constraints.

That still leaves the case where someone wants to create precise terminology declarations but has translation software that can't handle markup changes. One possibility would be to identify that the term is a preferred term with <glossterm> but also identify the term as an acronym with the <glossAcronym> element in the <glossAlt> section. For that approach to work, processors would have to detect the duplication:

[Image: compromise glossentries for other terms]
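
As a rough illustration of the duplication check a processor would need, here is a minimal Python sketch. The entry markup is a hypothetical reading of the compromise (the same text in <glossterm> and in a <glossAcronym> under <glossAlt>), and the function name preferred_term_is_acronym is invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical compromise entry: the preferred term is declared with
# <glossterm> and repeated as <glossAcronym> in the <glossAlt> section.
COMPROMISE_ENTRY = """
<glossentry id="abs">
  <glossterm>ABS</glossterm>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>
"""

def preferred_term_is_acronym(entry):
    """True if the <glossterm> text is duplicated by some <glossAcronym>."""
    preferred = (entry.findtext("glossterm") or "").strip()
    if not preferred:
        return False
    return any(
        (acronym.text or "").strip() == preferred
        for acronym in entry.iter("glossAcronym")
    )

entry = ET.fromstring(COMPROMISE_ENTRY)
print(preferred_term_is_acronym(entry))
```

A processor that finds such a duplicate could treat the preferred term as an acronym and skip the alternate form when publishing the glossary, so the term is not listed twice.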

Does that scale from term referencing to glossary publishing to terminology capture for automated text analysis?

Do people see issues with that approach?

Thanks,

Erik Hennum
ehennum@us.ibm.com

"JoAnn Hackos" <joann.hackos@comtech-serv.com>

12/18/2007 08:25 AM

To: "Gershon L Joseph" <gershon@tech-tav.com>, Erik Hennum/Oakland/IBM@IBMUS, <dita@lists.oasis-open.org>
Cc: "Andrzej Zydron" <azydron@xml-intl.com>, "Ogden, Jeff" <jogden@ptc.com>, "Kara Warburton" <KARA@CA.IBM.COM>
Subject: RE: [dita] Issue for unifying acronyms and glossary

I agree with Gershon,
This is not possible. We do not want translators who don’t know anything about DITA to add XML markup to anything.
JoAnn

JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver, CO 80215
303-232-7586
joann.hackos@comtech-serv.com
joannhackos Skype
www.comtech-serv.com

From: Gershon L Joseph [mailto:gershon@tech-tav.com]
Sent: Tuesday, December 18, 2007 9:07 AM
To: Erik Hennum; dita@lists.oasis-open.org
Cc: JoAnn Hackos; Andrzej Zydron; Ogden, Jeff; Kara Warburton
Subject: Re: [dita] Issue for unifying acronyms and glossary

Hi Erik,

I thought the surface form was intended to be used by the translators when a straight translation of the acronym is not appropriate for the target language.

I don't think option 2 is viable. It's not only way beyond current translation software, it's also going to be an issue for content management systems. They need to maintain a relationship between the source language and each translation, to identify what's changed in the source since the previous translation was done. I'm not aware of any CMS today that would support keeping track of changing markup in addition to the date of the source content.

I suspect we'll need more time to hash this one out. I can't see us getting to a final proposal today.

Gershon

----- Original Message ----
From: Erik Hennum <ehennum@us.ibm.com>
To: dita@lists.oasis-open.org
Cc: JoAnn Hackos <joann.hackos@comtech-serv.com>; Andrzej Zydron <azydron@xml-intl.com>; Gershon L Joseph <gershon@tech-tav.com>; "Ogden, Jeff" <jogden@ptc.com>; Kara Warburton <KARA@CA.IBM.COM>
Sent: Tuesday, December 18, 2007 12:02:57 AM
Subject: [dita] Issue for unifying acronyms and glossary

Hi, Terminology Enthusiasts:

JoAnn and I had a useful conversation about the integrated acronym and glossary proposal:

http://www.oasis-open.org/apps/org/workgroup/dita/download.php/26484/IssueGlossary12026.html

To refresh, the rationale for integrating the acronym and glossary proposals is to let users declare a term once for all terminology purposes instead of declaring the same term in different ways for glossary publication, text analysis dictionaries, and localization. The principle of defining a thing once and referring to it multiple times (also known as DRY or Don't Repeat Yourself) is generally accepted as important for good design in XML. In particular, it would be quite unfortunate if DITA had two different ways to declare and reference an acronym and its full form.

While JoAnn is still reviewing the integrated proposal, the conversation identified one issue: how to handle cases where an acronym or abbreviation is the preferred term in the original language but doesn't exist or isn't the preferred term in the target language.

The current integrated proposal envisions that the translator will change the element for the preferred term from <glossAcronym> to <glossterm> -- in essence, creating an appropriate glossentry for the target language. Current translation workbench software, however, typically allows a translator to edit content but not to change elements.

Two ways of handling this case occur to me:

1. Modify the currently specified expectations for linktext resolution behavior for glossary references to fall back to <glossFullForm> when the preferred term is empty and <glossSurfaceForm> is not appropriate. This behavior would be a lightweight addition for processes that get the linktext from <glossSurfaceForm> in some contexts.

2. Expect that translation workbench software will distinguish terminology declaration markup from content and support translators in modifying term declaration markup to create a glossentry suitable for the locale but not in changing the content markup.

The second approach would seem to provide greater flexibility. For instance, the second approach can handle the case where an abbreviation is the preferred form in the original language and the abbreviation exists in the target language but the full form is preferred. Also, the second approach would be required if the term is included in a published glossary so the appropriate title is available to topic processing.

However, the second approach does require implementers of translation workbench software to add full support for markup editing, which is significantly different from support for content editing. (In passing, a multi-level <indexterm> provides another case where the number of levels and thus the markup might be appropriate to change in translation.)

It would be possible to specify both approaches as expectations. A translation workbench that supports locale-specific terminology declarations won't create cases that take advantage of the fallback processing; other translation workbenches will.
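
The fallback processing in the first approach might look roughly like the following Python sketch. It is one reading of that approach, not a specified behavior: the element placement in the sample entry, the placeholder text, and the names fallback_linktext and surface_form_appropriate are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical translated entry: the translator could edit content but not
# elements, so the acronym element is simply left empty. Element placement
# is assumed, and the text is placeholder content.
TRANSLATED_ENTRY = """
<glossentry id="abs">
  <glossAcronym></glossAcronym>
  <glossFullForm>full form in the target language</glossFullForm>
  <glossSurfaceForm>surface form in the target language</glossSurfaceForm>
</glossentry>
"""

def _text(node):
    """Stripped text of an element, or '' if the element is missing or empty."""
    return (node.text or "").strip() if node is not None else ""

def fallback_linktext(entry, surface_form_appropriate=False):
    """Pick linktext, falling back to <glossFullForm> when the preferred term is empty."""
    if surface_form_appropriate:
        surface = _text(entry.find(".//glossSurfaceForm"))
        if surface:
            return surface
    # The preferred term may be declared as <glossAcronym> or <glossterm>
    # directly under <glossentry>.
    for tag in ("glossAcronym", "glossterm"):
        preferred = _text(entry.find(tag))
        if preferred:
            return preferred
    # Preferred term is empty: fall back to the full form.
    return _text(entry.find(".//glossFullForm"))

translated = ET.fromstring(TRANSLATED_ENTRY)
print(fallback_linktext(translated))
print(fallback_linktext(translated, surface_form_appropriate=True))
```

An entry produced by a workbench that supports locale-specific declarations would have a non-empty preferred term, so the final fallback line would never be reached for it.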

Do you see other ways of resolving this problem?

Erik Hennum
ehennum@us.ibm.com
