OASIS Mailing List Archives

 



xliff message



Subject: RE: [xliff] best practice: should extraction process create target elements?


Hi Doug,
 
You did a really nice job showing the different ways of looking at this.  I really liked thinking about this.
 
But I don't think this topic is one that lends itself to a best practice.  I think the answer to "should the extraction process create target elements" is "it depends."
 
My extraction process for my translation provider is a good example.
 
Factors include whether it is a new translation, or an update to a previous translation; whether the translation includes "boilerplate" material; etc.
 
Here's a mythical sample of an update to a previous translation:
 
<book>
  <warranty state="boilerplate">Some warranty here</warranty>
  <chapter state="new">This is my first chapter</chapter>
  <chapter state="updated">In July we said hello. Now in November we say goodbye.</chapter>
</book>
 
We extract into this XLIFF:
 
<t-u>
 <source>Some warranty here</source>
</t-u>
<!-- notice, no target.  The translation provider needs the source, let's say for context,
      but is locked out of the boilerplate -->
 
<t-u>
 <source>This is my first chapter</source>
 <target>This is my first chapter</target>
</t-u>
 <!-- notice there is a target here. 
      The translation provider finds it most efficient to type the translation directly into the target. 
      We find it best if they only enter text, not tags-->
 
<t-u>
 <source>In July we said hello. Now in November we say goodbye.</source>
 <target>Es julio. Decimos hola.</target>
</t-u>
<!-- our translation provider finds it useful to see the previous translation,
      even though the text has changed.  They type over the previous translation
      with the new translation.  It seemed odd to me at first.  But they insisted this
      helps them -->
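As a rough sketch only, the three cases above could be expressed like this in Python (standard library only). The function name `extract`, the `previous` lookup keyed by unit id, and the generated ids are hypothetical, made up for illustration; this is not our actual filter, and XLIFF namespaces and the surrounding file/body wrapper are left out.

```python
import xml.etree.ElementTree as ET


def extract(book_xml, previous=None):
    """Build one <trans-unit> per child of <book>.

    previous: hypothetical dict mapping unit id -> earlier translation.
    """
    previous = previous or {}
    body = ET.Element("body")
    for i, node in enumerate(ET.fromstring(book_xml), start=1):
        unit_id = str(i)  # assumed id scheme: position in the book
        text = node.text or ""
        tu = ET.SubElement(body, "trans-unit", id=unit_id)
        ET.SubElement(tu, "source").text = text

        state = node.get("state")
        if state == "boilerplate":
            # Source only: the provider sees it for context but has no target.
            continue

        target = ET.SubElement(tu, "target")
        if state == "updated" and unit_id in previous:
            # Pre-fill with the previous translation so it can be typed over.
            target.text = previous[unit_id]
        else:
            # New material: pre-fill with a copy of the source.
            target.text = text
    return body
```

Calling `extract(book_xml, {"3": "Es julio. Decimos hola."})` on the sample book would leave the warranty unit without a target, copy the source into the new chapter's target, and put the earlier translation into the updated chapter's target.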
 
The practice of when to generate target elements was driven more by our translation providers (each of whom is represented in the TC) than by me as a developer.
 
I don't think there is a best practice for generating target elements as part of the extraction, any more than I think there's a best practice for whether or not to use a skeleton file.  The answer to each, as I see it, seems to be "it depends."
 
Thanks for this thought-provoking topic,
 
Bryan
 
-----Original Message-----
From: Doug Domeny [mailto:ddomeny@ektron.com]
Sent: Thursday, March 09, 2006 7:16 AM
To: xliff@lists.oasis-open.org
Subject: [xliff] best practice: should extraction process create target elements?

All,

 

As most of you have probably guessed by now, Rodolfo and I have been ironing out some interoperability issues between the XLIFF produced by the Ektron CMS and the Heartsome XLIFF editor. As a result, I have a question of best practice.

 

Tony, would it be appropriate to have an online vote on this?

 

 

Should the extraction process create a target element with a copy of the source in it?

 

  1. No, the extraction process should only create the source tag.

 

      <trans-unit id="16" datatype="plaintext">
        <source>Hospital Wide</source>
      </trans-unit>

 

  2. Yes, but the target should be empty and the state="needs-translation".

 

      <trans-unit id="16" datatype="plaintext">
        <source>Hospital Wide</source>
        <target state="needs-translation"/>
      </trans-unit>

 

  3. Yes, the target should be a copy of the source and the state="needs-translation".

 

      <trans-unit id="16" datatype="plaintext">
        <source>Hospital Wide</source>
        <target state="needs-translation">Hospital Wide</target>
      </trans-unit>
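
To make the comparison concrete, here is a small Python sketch (standard library only) that emits each of the three forms above for one string. The function name `make_trans_unit` and the `option` parameter (1, 2 or 3, in the order the choices are listed) are assumptions for illustration; neither comes from the Ektron CMS nor from any XLIFF tool, and namespaces are omitted.

```python
import xml.etree.ElementTree as ET


def make_trans_unit(unit_id, source_text, option):
    """Return a <trans-unit> built according to one of the three options."""
    if option not in (1, 2, 3):
        raise ValueError("option must be 1, 2 or 3")

    tu = ET.Element("trans-unit", {"id": unit_id, "datatype": "plaintext"})
    ET.SubElement(tu, "source").text = source_text

    if option == 1:
        # First choice: source only, no target element at all.
        return tu

    # Second and third choices both add a target flagged as needing translation.
    target = ET.SubElement(tu, "target", {"state": "needs-translation"})
    if option == 3:
        # Third choice: the target starts out as a copy of the source.
        target.text = source_text
    return tu


if __name__ == "__main__":
    for option in (1, 2, 3):
        unit = make_trans_unit("16", "Hospital Wide", option)
        print(ET.tostring(unit, encoding="unicode"))
```

The only difference between the three outputs is whether the target exists and whether it carries text, which is what an XLIFF editor would have to handle on import.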

 

Once a decision is made, I recommend that the XLIFF 1.2 specification be modified to state the recommended practice. The segmentation section is clear in that it states:

 

“It is important to note that the manipulation / segmentation of trans-unit elements is owned by the "translator" domain, not at the extraction filter domain. This means that segmentation will be performed by the editing tool or possibly an automated segmentation process.”

 

I’m willing to draft the changes once the best practice is determined.

 

 

 

Here are my thoughts on it so far.

 

 

PROPOSITION: No target element when extracting

 

PROS

 

1. Easier to visually see which trans-units (TUs) have been translated and which need to be translated.

 

2. It would reduce the size of the XLIFF file after the extraction process.

 

3. XLIFF editors would know translation is needed (no target tag) without checking for state=”needs-translation”.

 

 

CONS

 

1. Translator wishing to ‘type over’ the original so as to retain inline tags would need to copy source, which may be easy in the XLIFF editor. However, if this is required most of the time, it would be better to avoid this step. Of course, the XLIFF editor could automatically copy.

 

2. Translator would need to copy from source to target in order to keep source that is the same when translated, such as with a proper name.

 

3. If a trans-unit (TU) is skipped or the XLIFF is merged without translating (e.g., when testing), then the merge process would need to fall back to the source or skeleton, which is probably a good idea anyhow.
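
The fallback mentioned in the last point might look roughly like the following Python sketch (standard library only). The function name `text_for_merge` is hypothetical, and treating a target with state="needs-translation" as untranslated is my assumption rather than anything the specification requires. Inline tags are flattened to plain text and namespaces are ignored to keep it short.

```python
import xml.etree.ElementTree as ET


def text_for_merge(tu):
    """Pick the text a merge step would write back for one trans-unit."""
    target = tu.find("target")
    if target is not None and target.get("state") != "needs-translation":
        # itertext() flattens any inline elements; a real merge would keep them.
        translated = "".join(target.itertext())
        if translated.strip():
            return translated
    # No target, an empty target, or one still marked as needing translation:
    # fall back to the source text.
    return "".join(tu.find("source").itertext())
```

With this rule a file merged straight after extraction reproduces the original text under any of the three conventions, which is handy when testing the round trip.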

 

 

While researching, I found that the open-language-tools subsegmenter utility states:

 

“The XLIFF SubSegmenter takes an existing XLIFF, segmented at the paragraph level, and re-segments it to sentence level. Of course, the incoming XLIFF file must only contain source segments - it doesn't do any complex source/target sentence-alignment functionality.”

 

Given this, I’m definitely leaning toward creating just the source element during extraction. I guess the thing that threw me was the “needs-translation” state.

 

Is the “needs-translation” state redundant? Should it be deprecated?

 

Regards,

 

Doug Domeny

Software Analyst

 

Ektron, Inc.

+1 603 594-0249 x212

http://www.ektron.com

 

 


