Subject: RE: [xliff-seg] Segmentation sub committee proposal for main XLIFF TC
Hi Andrzej,

Thank you for your excellent work! This looks good to me. I have only a few simple comments:

*) We should be consistent in examples and text and use "yes" and "no" for all the suggested attribute values (the last couple of examples use equivalent="false").

*) As the "equivalent" attribute can be applied to groups, should we consider renaming it to "equivalent-translation", just to make it clear that it explicitly refers to the translation, and not to any other properties that the group may span?

*) I believe under 3) it may also be important to mention that this approach has a negative impact on the ability to use tools (e.g. verification tools) developed for the original XLIFF format.

Let's discuss this in the meeting today.

Cheers,
Magnus

-----Original Message-----
From: Andrzej Zydron [mailto:email@example.com]
Sent: Saturday, October 30, 2004 2:21 PM
To: firstname.lastname@example.org
Subject: [xliff-seg] Segmentation sub committee proposal for main XLIFF TC

Hi Everyone,

The following revised proposal should (hopefully) encompass all of the feedback from Tuesday's meeting. I have made a few small changes. I have not included the "segment-" prefix on the attributes because, on reflection, they do not necessarily refer to segmentation issues. In addition, it appears that the use of <group merged-translations="yes"> implies the use of equivalent="no" on all child <trans-unit> elements.

The following proposals have been prepared by the XLIFF Segmentation sub-committee for consideration by the main XLIFF Technical Committee:

1) A new attribute should be introduced at the <group> and <trans-unit> element levels that indicates that any child <target> element content is a direct translation of the corresponding <source> element:

    equivalent (default value "yes")

The default value for this attribute is "yes", so the attribute can be omitted wherever the default applies.

Should the attribute be set to "no", this indicates that the translation in any child <target> element is not a direct equivalent of the <source> and SHOULD NOT be loaded into translation memory. This allows conforming systems to exclude text items from a translation memory when the target text has been marked as not being a direct equivalent of the source text.

Example - fixed length fields have forced the translator to place non-equivalent text against individual trans-units in order to display the text, but the individual translations are not a direct equivalent of the source text:

    <trans-unit id="t.1" equivalent="no">
      <source>Constrained text for limited</source>
      <target>Tekst angielski dla</target>
    </trans-unit>
    <trans-unit id="t.2" equivalent="no">
      <source>display for English</source>
      <target>ograniczonego pola</target>
    </trans-unit>

The translation meets the application requirements, but is not a direct translation of the source and should not be loaded into a leveraged translation memory database.

2) A new attribute should be introduced for the <group> element that indicates that the translation of the encompassed <trans-unit> elements must be treated as a whole and not as individual elements:

    merged-translations (no default value, not mandatory)

Example 1) - linguistically the translation only makes sense if the text within the <group> element is taken as a whole:

    <group merged-translations="yes">
      <trans-unit id="1" equivalent="false">
        <source>The text goes on,</source>
        <target>Texten går vidare</target>
      </trans-unit>
      <trans-unit id="2" equivalent="false">
        <source>and on, and on, and on.</source>
        <target>och vidare, och vidare.</target>
      </trans-unit>
    </group>

Example 2) - incorrect segmentation requires that the translation be taken as a whole:

    <group merged-translations="yes">
      <trans-unit id="t1" equivalent="false">
        <source>The German acronym v.</source>
        <target>Niemiecki skrót v. OT oznacza górną pozycję silnika.</target>
      </trans-unit>
      <trans-unit id="t2" equivalent="false">
        <source>OT signifies the top dead center position for an engine.</source>
        <target/>
      </trans-unit>
    </group>

The use of the merged-translations="yes" attribute at the group level implies that any child <trans-unit> elements should have the "equivalent" attribute set to "no".

3) Segmentation for "unprocessed" XLIFF files.

Where an XLIFF file has been created by a filter and no segmentation has been applied to the individual <source> elements, the XLIFF file can be treated as a normal XML file in which the <target> elements constitute text that may be segmented. The XLIFF file's <target> elements, which at that point contain the source text, can have segmentation information added by means of a segmentation namespace such as xml:tm using SRX rules. A normal XML XLIFF extraction can then be executed on the file using either an XSLT transformation or a program. The resultant skeleton file will enable the translated text to be merged with the original XLIFF document. An XSLT transformation can then be used to strip out the segmentation namespace, resulting in a "translated" original unsegmented XLIFF file.

This solution is ideal for a production process that can handle pipeline transformations and where the XLIFF document constitutes raw, unprocessed extracted text. It is not necessarily suited to interactive segmentation executed within a user-interface-centered environment, nor to cases where the XLIFF file has already had some form of translation memory matching applied to it. The XLIFF segmentation sub-committee will continue trying to reach a solution for these types of environment.

Best Regards,

AZ

--
email - email@example.com
smail - c/o Mr.
A.Zydron PO Box 2167 Gerrards Cross Bucks SL9 8XF United Kingdom Mobile +(44) 7966 477 181 FAX +(44) 1753 480 465 www - http://www.xml-intl.com
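
To make proposals 1) and 2) concrete, here is a minimal Python sketch of how a conforming tool might decide which trans-units are safe to load into translation memory. It is only an illustration under stated assumptions: the function name tm_candidates and the sample document are hypothetical, the "equivalent" and "merged-translations" attributes are the ones proposed in this thread rather than part of any published XLIFF schema, XML namespaces are omitted for brevity, and the rule that a group-level "equivalent" value is inherited by child trans-units unless they override it is an assumption the proposal does not spell out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one ordinary unit, one unit marked non-equivalent,
# and one group whose translations only make sense as a whole.
SAMPLE = """<body>
  <trans-unit id="a">
    <source>Hello</source>
    <target>Hej</target>
  </trans-unit>
  <trans-unit id="t.1" equivalent="no">
    <source>Constrained text for limited</source>
    <target>Tekst angielski dla</target>
  </trans-unit>
  <group merged-translations="yes">
    <trans-unit id="1">
      <source>The text goes on,</source>
      <target>Texten går vidare</target>
    </trans-unit>
    <trans-unit id="2">
      <source>and on, and on, and on.</source>
      <target>och vidare, och vidare.</target>
    </trans-unit>
  </group>
</body>"""


def tm_candidates(element, inherited="yes", merged=False):
    """Yield (id, source, target) for trans-units considered TM-safe.

    inherited: the "equivalent" value in force from enclosing groups
               (assumption: group values apply to children by default).
    merged:    True inside a group with merged-translations="yes",
               which the proposal says implies equivalent="no".
    """
    for child in element:
        if child.tag == "group":
            child_merged = merged or child.get("merged-translations") == "yes"
            child_equiv = child.get("equivalent", inherited)
            yield from tm_candidates(child, child_equiv, child_merged)
        elif child.tag == "trans-unit":
            # The attribute defaults to "yes" when omitted.
            if merged or child.get("equivalent", inherited) != "yes":
                continue
            source = child.findtext("source")
            target = child.findtext("target")
            if source and target:  # skip missing or empty targets
                yield child.get("id"), source, target


if __name__ == "__main__":
    print(list(tm_candidates(ET.fromstring(SAMPLE))))
    # prints [('a', 'Hello', 'Hej')]
```

Only unit "a" survives: "t.1" is excluded by its own equivalent="no", and both units in the merged group are excluded without needing the attribute repeated on each child. A tool that wanted to keep merged groups could instead concatenate the sources and targets and store the pair as a single entry, but that goes beyond what the proposal states.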