xliff-seg message

Subject: RE: [xliff-seg] Segmentation proposals for the main XLIFF TC
From: "Magnus Martikainen" <magnus@trados.com>
To: "Andrzej Zydron" <azydron@xml-intl.com>,<xliff-seg@lists.oasis-open.org>
Date: Mon, 25 Oct 2004 18:10:24 -0700

Hi Andrzej,

Thank you for your proposal - it is a great starting point for discussions.
Here are my thoughts:

1) For which attribute/element are you proposing that we add these attribute values? I think we must be a bit more specific.

Also, a couple of explicit examples are always useful to highlight the issues and make them easy to talk about.

1.1) No comments - I agree with your proposal.

1.2) I don't feel entirely comfortable with your suggestion here. It seems unnatural to me to put the entire translation in the first element and leave the other <trans-unit> elements with empty <target>. 

Also, my gut feeling says that we should be more explicit about exactly which <trans-unit> elements are merged in the translation. 

An alternative option could be to use a <group> around the "merged" <trans-unit> elements. We could allow the translation to be spread out across the different <trans-unit> elements (e.g. with a roughly equivalent length in each target), and we could make use of the mechanism proposed in 1.1) to indicate that the <target> elements do not contain an exact translation of the <source>.

This would allow us to e.g. support validation of length restrictions on the individual <trans-unit> elements, should that be desirable.

The following example illustrates the principle of this mechanism (not intended literally):

<group merged-translations="yes">
  <trans-unit id="1" not-equivalent="true">
    <source>The text goes on,</source>
    <target>Texten går vidare</target>
  </trans-unit>
  <trans-unit id="2" not-equivalent="true">
    <source>and on, and on, and on.</source>
    <target>och vidare, och vidare.</target>
  </trans-unit>
</group>
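
To make the length-restriction point concrete, here is a minimal Python sketch of such a check run per <trans-unit> inside the group. It is only an illustration: the "merged-translations" and "not-equivalent" attributes are the placeholders from the example above, and the function name and the max_chars limit are my own assumptions, not anything defined by XLIFF.

```python
import xml.etree.ElementTree as ET

# Markup copied from the example above. "merged-translations" and
# "not-equivalent" are proposal placeholders, not existing XLIFF attributes.
SAMPLE = """
<group merged-translations="yes">
  <trans-unit id="1" not-equivalent="true">
    <source>The text goes on,</source>
    <target>Texten går vidare</target>
  </trans-unit>
  <trans-unit id="2" not-equivalent="true">
    <source>and on, and on, and on.</source>
    <target>och vidare, och vidare.</target>
  </trans-unit>
</group>
"""


def overlong_targets(group_xml, max_chars):
    """Return the ids of trans-units whose target exceeds max_chars characters.

    max_chars is a hypothetical per-unit limit chosen for illustration.
    """
    group = ET.fromstring(group_xml)
    too_long = []
    for unit in group.findall("trans-unit"):
        target = unit.findtext("target", default="")
        if len(target) > max_chars:
            too_long.append(unit.get("id"))
    return too_long


print(overlong_targets(SAMPLE, max_chars=20))
```

Because each <trans-unit> keeps its own <target>, the check can report the individual unit that is too long, which would not be possible if the whole translation sat in the first element.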


2) I think we should explicitly point out the problems we have identified with this approach (which is why we are working on a mechanism that better supports segmentation in XLIFF), e.g.:

* Segmentation cannot easily be changed during translation.

* The same content appears in, and must be worked on in, multiple XLIFF representations. This adds complexity in managing the different formats and the conversions between them. Automatically propagating changes between representations can be difficult, or in some circumstances impossible to do with 100% certainty.

* Running custom developed tools for the original XLIFF format (e.g. for verification or previewing) is not easily accomplished once the file has been converted for segmentation.

* It is not clear how <alt-trans> content in the original representation could be converted or represented in a segmented representation.


Let's discuss the details in the meeting tomorrow.

Best regards,
Magnus

-----Original Message-----
From: Andrzej Zydron [mailto:azydron@xml-intl.com] 
Sent: Saturday, October 02, 2004 4:29 AM
To: xliff-seg@lists.oasis-open.org
Subject: [xliff-seg] Segmentation proposals for the main XLIFF TC

Hi Everyone,

With regard to last Tuesday's meeting, I would like to propose that we are ready 
to submit the following points to the main XLIFF committee:

1) The following two extra attribute values are required to enable a translator 
to indicate that the target trans-unit is not a direct translation of the source:

1.1) "not-equivalent" - this attribute value indicates that the target text 
should not be loaded into any leveraged memory database. This can occur if the 
source relates to market-specific details that are not relevant to 
leveraged memory.

1.2) "merged" - this attribute value indicates that the target text relates to 
one or more trans-unit source text elements. This can happen when the source 
text has been segmented in such a way that a translation of the individual 
elements is impossible without recourse to a "merged" translation against the 
head element. In such an instance the text for the encompassed target elements 
must be empty to indicate that the translation is merged with the head element. 
The next non empty target element indicates the end of the run of "merged" elements.
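
As a rough illustration of rule 1.2, the following Python sketch groups trans-units into runs. It is a sketch under assumptions: the proposal does not yet say which attribute or element carries the "merged" value, so it is modelled here as a plain boolean per unit, and the function name and tuple layout are hypothetical.

```python
def merged_runs(units):
    """Group trans-units into runs according to the "merged" rule in 1.2.

    units: list of (unit_id, target_text, is_merged_head) tuples in document
    order. is_merged_head is an assumed stand-in for wherever the "merged"
    value ends up being stored.
    Returns a list of id lists. A head marked as merged absorbs the units that
    immediately follow it and have an empty target; the next non-empty target
    ends the run.
    """
    runs = []
    in_merged_run = False
    for unit_id, target, is_head in units:
        if in_merged_run and not is_head and target == "":
            # Empty target directly after a merged head: part of that run.
            runs[-1].append(unit_id)
        else:
            # A non-empty target (or a new head) starts a new run.
            runs.append([unit_id])
            in_merged_run = is_head
    return runs


example = [
    ("1", "Merged translation of units 1 to 3.", True),
    ("2", "", False),
    ("3", "", False),
    ("4", "Ordinary translation.", False),
    ("5", "", False),  # empty but not after a merged head: simply untranslated
]
print(merged_runs(example))
```

Note that an empty target only counts as merged when it directly follows a merged head, so an untranslated unit elsewhere in the file is not mistaken for part of a run.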

2) We have established a perfectly viable way of handling segmentation: 
treating an XLIFF file as a normal XML file. The XLIFF file 
target elements, which at the time contain the source text, can have 
segmentation information added by means of a segmentation namespace such as 
xml:tm using SRX rules. A normal XML XLIFF extraction can then be executed on 
the file using either an XSLT transformation or a program. The resultant skeleton 
file will enable the translated text to be merged with the original XLIFF 
document. An XSLT transformation can then be used to strip out the segmentation 
namespace, resulting in a "translated" original unsegmented XLIFF file. This 
solution is ideal for a production process that can handle pipeline 
transformations. It is not necessarily suited to interactive segmentation that 
is being executed within a user interface centered environment. The XLIFF 
segmentation sub-committee will continue trying to reach a solution for that 
type of environment.
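
For illustration only, the final "strip out the segmentation namespace" step could look roughly like the Python sketch below. It uses Python's ElementTree in place of the XSLT transformation mentioned above. The namespace URI "urn:example:seg" and the element name "seg" are hypothetical stand-ins rather than the real xml:tm vocabulary, and the sketch assumes segment elements contain text only.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace standing in for the segmentation namespace.
SEG_NS = "urn:example:seg"


def strip_segments(elem):
    """Unwrap every child element in SEG_NS, keeping its text in place.

    Assumes segment elements hold text only; any elements nested inside a
    segment would be dropped by this simplified sketch.
    """
    for child in list(elem):
        strip_segments(child)
    kept = []  # non-segment children seen so far, in order
    text = elem.text or ""
    for child in list(elem):
        if child.tag.startswith("{" + SEG_NS + "}"):
            piece = (child.text or "") + (child.tail or "")
            if kept:
                # Text belongs after the last surviving sibling.
                kept[-1].tail = (kept[-1].tail or "") + piece
            else:
                # No surviving sibling yet: text belongs to the parent.
                text += piece
            elem.remove(child)
        else:
            kept.append(child)
    if text:
        elem.text = text


SEGMENTED = (
    '<target xmlns:s="urn:example:seg">'
    "<s:seg>First sentence.</s:seg> <s:seg>Second one.</s:seg>"
    "</target>"
)
root = ET.fromstring(SEGMENTED)
strip_segments(root)
print(ET.tostring(root, encoding="unicode"))
```

After the call, the <target> element holds the plain text of both segments with no segment children left, which is the unsegmented form the pipeline is meant to end with.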

I have provided examples for point 2) that show individual stages of the process.

We may need to rework some of the wording of the points.

Best Regards,

AZ

-- 


email - azydron@xml-intl.com
smail - c/o Mr. A.Zydron
	PO Box 2167
         Gerrards Cross,
         Bucks SL9 8XF
Mobile +(44) 7966 477181
FAX    +(44) 870 831 8868
www - http://www.xml-intl.com
