Subject: Re: [xliff-seg] Segmentation sub committee proposal for main XLIFFTC
Hi Magnus,

Thank you for your reply:

Magnus Martikainen wrote:
> Hi Andrzej,
>
> Thank you for your excellent work! This looks good to me.
> I have only a few simple comments:
>
> *) We should be consistent in examples and text and use "yes" and "no" for
> all the suggested attribute values (the last couple of examples use
> equivalent="false").

I agree - this was an oversight on my part. Let us use "yes" and "no" throughout.

> *) As the "equivalent" attribute can be applied to groups, should we consider
> renaming it to "equivalent-translation", just to make it clear that it
> explicitly refers to the translation, and not to any other properties that
> the group may span?

I have no objections - it is more explicit than just "equivalent".

> *) I believe under 3) it may also be important to mention the fact that this
> approach has a negative impact on the ability to use tools (e.g. verification
> tools) developed for the original XLIFF format.
>

The input and output files are totally XLIFF compliant. Any intermediate files
are considered transient and:

1) Should not be visible to any other parts of the process.
2) Should not be used for any other purposes, other than enabling creation of
   the segmented version of the XLIFF document, and merging back again.

They have the same type of status as a skeleton file would have and as such
have no negative impact on any processing.

I appreciate your comments,

AZ

> Let's discuss this in the meeting today.
>
> Cheers,
> Magnus
>
> -----Original Message-----
> From: Andrzej Zydron [mailto:firstname.lastname@example.org]
> Sent: Saturday, October 30, 2004 2:21 PM
> To: email@example.com
> Subject: [xliff-seg] Segmentation sub committee proposal for main XLIFF TC
>
> Hi Everyone,
>
> The following revised proposal should (hopefully) encompass all of the feedback
> information from Tuesday's meeting. I have made a few small changes.
> I have not included the "segment-" prefix to the attributes, because on
> reflection they do not necessarily refer to segmentation issues. In addition
> it appears that the use of <group merged-translations="yes"> implies the use
> of equivalent="no" on all child <trans-unit> elements:
>
> The following proposals have been prepared by the XLIFF Segmentation
> sub-committee for consideration by the main XLIFF Technical Committee:
>
> 1) A new attribute should be introduced at the <group> and <trans-unit> element
> levels that indicates that any child <target> element content is a direct
> translation of the corresponding <source> element:
>
> equivalent (default value "yes")
>
> The default value for this attribute is "yes" and as such the attribute can be
> omitted for all instances where the default applies. Should the attribute be set
> to "no" this indicates that the translation in any child <target> element is not
> a direct equivalent of the <source> and SHOULD NOT be loaded into translation
> memory. This attribute allows any conforming systems to exclude any text items
> from being loaded into a translation memory system if it has been indicated that
> the target text is not a direct equivalent of the source text.
>
> Example - fixed length fields have forced the translator to place non-equivalent
> text against individual trans-units in order to display the text, but the
> individual translations are not a direct equivalent of the source text:
>
> <trans-unit id="t.1" equivalent="no">
>   <source>Constrained text for limited</source>
>   <target>Tekst angielski dla</target>
> </trans-unit>
> <trans-unit id="t.2" equivalent="no">
>   <source>display for English</source>
>   <target>ograniczonego pola</target>
> </trans-unit>
>
> The translation meets the application requirements, but is not a direct
> translation of the source and should not be loaded into a leveraged translation
> memory database.
>
> 2) A new attribute should be introduced for the <group> element that indicates
> that the translation of the encompassed <trans-unit> elements must be treated as
> a whole and not as individual elements:
>
> merged-translations (no default value, not mandatory)
>
> Example 1) - linguistically the translation only makes sense if the text within
> the <group> element is taken as a whole:
>
> <group merged-translations="yes">
>   <trans-unit id="1" equivalent="false">
>     <source>The text goes on,</source>
>     <target>Texten går vidare</target>
>   </trans-unit>
>   <trans-unit id="2" equivalent="false">
>     <source>and on, and on, and on.</source>
>     <target>och vidare, och vidare.</target>
>   </trans-unit>
> </group>
>
> Example 2) - incorrect segmentation requires that the translation be taken as a
> whole:
>
> <group merged-translations="yes">
>   <trans-unit id="t1" equivalent="false">
>     <source>The German acronym v.</source>
>     <target>Niemiecki skrót v. OT oznacza górną pozycję silnika.</target>
>   </trans-unit>
>   <trans-unit id="t2" equivalent="false">
>     <source>OT signifies the top dead center position for an engine.</source>
>     <target/>
>   </trans-unit>
> </group>
>
> The use of the merged-translations="yes" attribute at the group level implies
> that any child <trans-unit> elements should have the "equivalent" attribute set
> to "no".
>
> 3) Segmentation for "unprocessed" XLIFF files. Where an XLIFF file has been
> created by a filter and no segmentation has been applied to the individual
> <source> elements, the XLIFF file can be considered as a normal XML file where
> the <target> elements constitute text that may be segmented.
>
> The XLIFF file target elements, which at the time contain the source text, can
> have segmentation information added by means of a segmentation namespace such as
> xml:tm using SRX rules. A normal XML XLIFF extraction can then be executed on
> the file using either an XSLT transformation or a program.
> The resultant skeleton file will enable the translated text to be merged with
> the original XLIFF document. An XSLT transformation can then be used to strip
> out the segmentation namespace, resulting in a "translated" original
> unsegmented XLIFF file. This solution is ideal for a production process that
> can handle pipeline transformations and where the XLIFF document constitutes
> raw, unprocessed extracted text.
>
> This solution is not necessarily suited to interactive segmentation that is
> being executed within a user interface centered environment, nor where the XLIFF
> file has already had some form of translation memory matching applied to it. The
> XLIFF segmentation sub-committee will continue trying to reach a solution for
> these types of environment.
>
> Best Regards,
>
> AZ
>

--
email - firstname.lastname@example.org
smail - c/o Mr. A.Zydron
        PO Box 2167
        Gerrards Cross
        Bucks SL9 8XF
        United Kingdom
Mobile +(44) 7966 477 181
FAX +(44) 1753 480 465
www - http://www.xml-intl.com

This message contains confidential information and is intended only for the
individual named. If you are not the named addressee you may not disseminate,
distribute or copy this e-mail. Please notify the sender immediately by e-mail
if you have received this e-mail by mistake and delete this e-mail from your
system. E-mail transmission cannot be guaranteed to be secure or error-free as
information could be intercepted, corrupted, lost, destroyed, arrive late or
incomplete, or contain viruses. The sender therefore does not accept liability
for any errors or omissions in the contents of this message which arise as a
result of e-mail transmission. If verification is required please request a
hard-copy version. Unless explicitly stated otherwise this message is provided
for informational purposes only and should not be construed as a solicitation
or offer.
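
Illustrative note on proposals 1) and 2) in the quoted message: the rule that
text marked equivalent="no" should not be loaded into translation memory could
be applied by a tool roughly as sketched below. This is only a sketch under
stated assumptions, not part of the proposal. The attributes "equivalent" and
"merged-translations" are the proposed ones and are not in the published XLIFF
schema; the function names (tm_pairs, child_text, local), the rule that a "no"
on an enclosing <group> applies to every <trans-unit> inside it, and the first
sample unit ("Save file") are all hypothetical. It accepts only "yes"/"no", as
agreed in the reply above.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the element name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def child_text(unit, name):
    """Return the stripped text of the first child called `name`, or ''."""
    for child in unit:
        if local(child.tag) == name:
            return "".join(child.itertext()).strip()
    return ""


def tm_pairs(element, excluded=False):
    """Yield (source, target) pairs that may be loaded into a TM.

    Assumption: once an enclosing <group> is marked equivalent="no" or
    merged-translations="yes", every <trans-unit> inside it is skipped.
    """
    for child in element:
        name = local(child.tag)
        if name == "group":
            group_excluded = (
                excluded
                or child.get("equivalent", "yes") == "no"
                or child.get("merged-translations") == "yes"
            )
            yield from tm_pairs(child, group_excluded)
        elif name == "trans-unit":
            if excluded or child.get("equivalent", "yes") == "no":
                continue
            source = child_text(child, "source")
            target = child_text(child, "target")
            if source and target:
                yield (source, target)
        else:
            # Other containers (e.g. <file>, <body>) are simply descended into.
            yield from tm_pairs(child, excluded)


# Hypothetical sample: one ordinary unit plus fragments from the proposal.
SAMPLE = """
<body>
  <trans-unit id="a">
    <source>Save file</source>
    <target>Zapisz plik</target>
  </trans-unit>
  <trans-unit id="t.1" equivalent="no">
    <source>Constrained text for limited</source>
    <target>Tekst angielski dla</target>
  </trans-unit>
  <group merged-translations="yes">
    <trans-unit id="1">
      <source>The text goes on,</source>
      <target>Texten går vidare</target>
    </trans-unit>
  </group>
</body>
""".strip()

pairs = list(tm_pairs(ET.fromstring(SAMPLE)))
print(pairs)  # [('Save file', 'Zapisz plik')]
```

Only the ordinary unit survives: the fixed-length unit is dropped because of
its own equivalent="no", and the unit inside the merged group is dropped
because merged-translations="yes" implies equivalent="no" for its children.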