dita-translation message
Subject: RE: [dita] Issue for unifying acronyms and glossary
- From: "JoAnn Hackos" <joann.hackos@comtech-serv.com>
- To: "Erik Hennum" <ehennum@us.ibm.com>,<dita-translation@lists.oasis-open.org>
- Date: Wed, 19 Dec 2007 06:31:59 -0700
Erik,
Please include the Translation SC in these emails. Everyone
on the SC has a voice.
What happened to the expanded form? We actually need
everything that is in our original proposal. Each item has a distinct purpose.
Remember that under no circumstances can translators change the
XML.
JoAnn
JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver CO 80215
303-232-7586
joann.hackos@comtech-serv.com
Hi, Gershon and JoAnn:
I agree entirely that it would be good to take
a little more time with this. We really don't want to rush in something that
will create problems for years to come.
I'm wondering if the following
refinements would improve the integrated proposal for the minimalist term
referencing use cases (including acronym referencing):
* Make <glossdef> optional.
* Make <glossPartOfSpeech> optional (assumed to default to noun if not
specified).
* Move <glossSurfaceForm> to <glossBody> and specialize it from <p>, on the
grounds that the surface form will never have usage or linguistic properties;
i.e., it's better to treat the surface form as a property of the preferred form
of the term rather than as an alternate form.
That allows minimalist declarations of terms like the
following:
<glossgroup id="carterms">
  <title/>
  <glossentry id="abs">
    <glossterm>ABS</glossterm>
    <glossBody>
      <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    </glossBody>
  </glossentry>
  ...
  minimalist glossentry declarations for other terms
  ...
</glossgroup>
Note that the acronym is represented as a
glossterm. That handles the case where translation workbench software isn't able
to change elements -- that is, the translator doesn't change the element. As
specified in the current integrated proposal, references to the abs topic via
keyref will resolve to <glossterm> text in most cases but to the
<glossSurfaceForm> text where the more complete form is
appropriate.
The best practice for glossentries would be much more complete, similar to the
following (where translation software can modify markup):
<glossgroup id="carterms">
  <title>Car terminology</title>
  <glossentry id="abs">
    <glossAcronym>ABS</glossAcronym>
    <glossdef>A brake technology that minimizes skids.</glossdef>
    <glossBody>
      <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
      <glossPartOfSpeech value="noun"/>
      <glossAlt>
        <glossFullForm>Anti-lock Braking System</glossFullForm>
      </glossAlt>
    </glossBody>
  </glossentry>
  ...
  best practice glossentries for other terms
  ...
</glossgroup>
Such best practices could be enforced by
constraints.
That still leaves the case where someone wants to create
precise terminology declarations but has translation software that can't handle
markup changes. One possibility would be to identify that the term is a
preferred term with <glossterm> but also identify the term as an acronym
with the <glossAcronym> element in the <glossAlt> section. For that
approach to work, processors would have to detect the duplication:
<glossgroup id="carterms">
  <title>Car terminology</title>
  <glossentry id="abs">
    <glossterm>ABS</glossterm>
    <glossdef>A brake technology that minimizes skids.</glossdef>
    <glossBody>
      <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
      <glossPartOfSpeech value="noun"/>
      <glossAlt>
        <glossAcronym>ABS</glossAcronym>
      </glossAlt>
      <glossAlt>
        <glossFullForm>Anti-lock Braking System</glossFullForm>
      </glossAlt>
    </glossBody>
  </glossentry>
  ...
  compromise glossentries for other terms
  ...
</glossgroup>
Does that scale from term referencing to
glossary publishing to terminology capture for automated text
analysis?
Do people see issues with that
approach?
Thanks,
Erik Hennum
ehennum@us.ibm.com
"JoAnn Hackos" <joann.hackos@comtech-serv.com>
12/18/2007 08:25 AM
I agree with Gershon,
This is not possible. We do not want translators who don’t know
anything about DITA to add XML markup to anything.
JoAnn
JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver, CO 80215
303-232-7586
joann.hackos@comtech-serv.com
joannhackos Skype
www.comtech-serv.com
From: Gershon L Joseph [mailto:gershon@tech-tav.com]
Sent: Tuesday, December 18, 2007 9:07 AM
To: Erik Hennum; dita@lists.oasis-open.org
Cc: JoAnn Hackos; Andrzej Zydron; Ogden, Jeff; Kara Warburton
Subject: Re: [dita] Issue for unifying acronyms and glossary
Hi Erik,
I thought the surface form was intended to be used by the translators when a
straight translation of the acronym is not appropriate for the target language.
I don't think option 2 is viable. It's not only way beyond current translation
software, it's also going to be an issue for content management systems. They
need to maintain a relationship between the source language and each
translation, to identify what's changed in the source since the previous
translation was done. I'm not aware of any CMS today that would support keeping
track of changing markup in addition to the date of the source content.
I suspect we'll need more time to hash this one out. I can't see us getting to
a final proposal today.
Gershon
----- Original Message ----
From: Erik Hennum <ehennum@us.ibm.com>
To: dita@lists.oasis-open.org
Cc: JoAnn Hackos <joann.hackos@comtech-serv.com>; Andrzej Zydron
<azydron@xml-intl.com>; Gershon L Joseph <gershon@tech-tav.com>;
"Ogden, Jeff" <jogden@ptc.com>; Kara Warburton <KARA@CA.IBM.COM>
Sent: Tuesday, December 18, 2007 12:02:57 AM
Subject: [dita] Issue for unifying acronyms and glossary
Hi, Terminology Enthusiasts:
JoAnn and I had a useful conversation about the integrated acronym and glossary
proposal:
To refresh, the rationale for integrating the
acronym and glossary proposals is to let users declare a term once for all
terminology purposes instead of declaring the same term in different ways for
glossary publication, text analysis dictionaries, and localization. The
principle of defining a thing once and referring to it multiple times (also
known as DRY or Don't Repeat Yourself) is generally accepted as important for
good design in XML. In particular, it would be quite unfortunate if DITA had two
different ways to declare and reference an acronym and its full form.
While JoAnn is still reviewing the integrated proposal, the conversation
identified one issue: how to handle cases where an acronym or abbreviation is
the preferred term in the original language but doesn't exist or isn't the
preferred term in the target language.
The current integrated proposal
envisions that the translator will change the element for the preferred term
from <glossAcronym> to <glossterm> -- in essence, creating an
appropriate glossentry for the target language. Current translation workbench
software, however, typically allows a translator to edit content but not to
change elements.
Two ways of handling this case occur to me:
1. Modify the currently specified expectations for linktext resolution behavior
for glossary references to fall back to <glossFullForm> when the preferred
term is empty and <glossSurfaceForm> is not appropriate. This behavior
would be a lightweight addition for processes that get the linktext from
<glossSurfaceForm> in some contexts.
2. Expect that translation workbench software will distinguish terminology
declaration markup from content and support translators in modifying term
declaration markup to create a glossentry suitable for the locale but not in
changing the content markup.
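As a rough illustration of the first option, the fallback could look something like the Python sketch below. It is written under several assumptions: the translated entry is hypothetical (the translator has emptied the acronym and left the English full form as placeholder text), the function name resolve_with_fallback and the surface_form_appropriate flag are invented for this sketch, and the element locations follow the examples in this thread rather than any final specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical translated entry: the acronym has no equivalent in the target
# language, so the translator emptied its content without touching the markup.
TRANSLATED_ENTRY = """
<glossentry id="abs">
  <glossAcronym></glossAcronym>
  <glossBody>
    <glossAlt>
      <glossFullForm>Anti-lock Braking System</glossFullForm>
    </glossAlt>
  </glossBody>
</glossentry>
"""


def _text(element):
    """Return the whitespace-normalized text of an element, or '' if missing."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def resolve_with_fallback(entry, surface_form_appropriate=False):
    """Resolve link text, falling back to the full form if nothing else fits.

    surface_form_appropriate is a hypothetical flag (an assumption here).
    """
    # The preferred term may be declared as glossterm or as glossAcronym.
    preferred = ""
    for name in ("glossterm", "glossAcronym"):
        preferred = _text(entry.find(name))
        if preferred:
            break
    surface = _text(entry.find(".//glossSurfaceForm"))
    if surface_form_appropriate and surface:
        return surface
    if preferred:
        return preferred
    # Fallback: the preferred term is empty, so use the full form instead.
    return _text(entry.find(".//glossFullForm"))


entry = ET.fromstring(TRANSLATED_ENTRY)
print(resolve_with_fallback(entry))
```

The point of the sketch is that the element names never change in translation; only the content does, and the processor absorbs the difference.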
The second approach would seem to provide greater flexibility. For
instance, the second approach can handle the case where an abbreviation is the
preferred form in the original language and the abbreviation exists in the
target language but the full form is preferred. Also, the second approach would
be required if the term is included in a published glossary so the appropriate
title is available to topic processing.
However, the second approach does
require implementers of translation workbench software to add full support for
markup editing, which is significantly different from support for content
editing. (In passing, a multi-level <indexterm> provides another case
where the number of levels and thus the markup might be appropriate to change in
translation.)
It would be possible to specify both approaches as
expectations. A translation workbench that supports locale-specific terminology
declarations won't create cases that take advantage of the fallback processing;
other translation workbenches will.
Do you see other ways of
resolving this problem?
Erik Hennum
ehennum@us.ibm.com