OASIS Mailing List Archives

xliff-seg message



Subject: RE: [xliff-seg] Segmentation sub committee proposal for main XLIFF TC


Hi Andrzej,

Thank you for your reply.

Regarding 3) imagine the following scenario:

There is a filter for a proprietary native file format that produces "raw" XLIFF. Accompanying the filter is a verification tool that "knows" how to validate this particular flavour of XLIFF to ensure that the content can be converted to a valid native file. The verification tool knows what the tags mean and how to deal with the skeleton content.
It is obviously desirable to be able to run the verification tool on the "converted" or "segmented" XLIFF files, but that will not be possible without first converting them back to the original XLIFF format. 
Even if the intermediate files are made available for the purpose of doing this, if the verification tool actually makes changes to the file (e.g. to fix potential problems) there is no simple way of converting the file back to its "segmented" form. (Such changes could even be to the skeleton content...)

If something is still unclear about this scenario, let's discuss it in the meeting today.

Cheers,
Magnus

-----Original Message-----
From: Andrzej Zydron [mailto:azydron@xml-intl.com] 
Sent: Tuesday, November 09, 2004 6:42 AM
To: Magnus Martikainen
Cc: xliff-seg@lists.oasis-open.org
Subject: Re: [xliff-seg] Segmentation sub committee proposal for main XLIFF TC

Hi Magnus,

Thank you for your reply:

Magnus Martikainen wrote:

> Hi Andrzej,
> 
> Thank you for your excellent work! This looks good to me.
> I have only a few simple comments:
> 
> *) We should be consistent in examples and text and use "yes" and "no" for all the suggested attribute values (the last couple of examples use equivalent="false").

I agree - this was an oversight on my part. Let us use "yes" and "no" throughout.

> *) As the "equivalent" attribute can be applied to groups, should we consider renaming it to "equivalent-translation", just to make it clear that it explicitly refers to the translation, and not to any other properties that the group may span?

I have no objections - it is more explicit than just "equivalent".

> *) I believe under 3) it may also be important to mention the fact that this approach has a negative impact on the ability to use tools (e.g. verification tools) developed for the original XLIFF format.
>

The input and output files are totally XLIFF compliant. Any intermediate files are 
considered transient and:

1) Should not be visible to any other parts of the process.
2) Should not be used for any other purposes, other than enabling creation of 
the segmented version of the XLIFF document, and merging back again.

They have the same type of status as a skeleton file would have and as such have 
no negative impact on any processing.

I appreciate your comments,

AZ

> Let's discuss this in the meeting today.
> 
> Cheers,
> Magnus
> 
> -----Original Message-----
> From: Andrzej Zydron [mailto:azydron@xml-intl.com] 
> Sent: Saturday, October 30, 2004 2:21 PM
> To: xliff-seg@lists.oasis-open.org
> Subject: [xliff-seg] Segmentation sub committee proposal for main XLIFF TC
> 
> Hi Everyone,
> 
> The following revised proposal should (hopefully) encompass all of the feedback 
> information from Tuesday's meeting. I have made a few small changes. I have not 
> included the "segment-" prefix to the attributes, because on reflection they do 
> not necessarily refer to segmentation issues. In addition, it appears that the 
> use of <group merged-translations="yes"> implies the use of equivalent="no" to 
> all child <trans-unit> elements:
> 
> The following proposals have been prepared by the XLIFF Segmentation 
> sub-committee for consideration by the main XLIFF Technical Committee:
> 
> 1) A new attribute should be introduced at the <group> and <trans-unit> element 
> levels that indicates that any child <target> element content is a direct 
> translation of the corresponding <source> element:
> 
> equivalent (default value "yes")
> 
> The default value for this attribute is "yes" and as such the attribute can be 
> omitted for all instances where the default applies. Should the attribute be set 
> to "no" this indicates that the translation in any child <target> element is not 
> a direct equivalent of the <source> and SHOULD NOT be loaded into translation 
> memory. This attribute allows any conforming systems to exclude any text items 
> from being loaded into a translation memory system if it has been indicated that 
> the target text is not a direct equivalent of the source text.
> 
> Example - fixed length fields have forced the translator to place non-equivalent 
> text against individual trans-units in order to display the text, but the 
> individual translations are not a direct equivalent of the source text:
> 
> <trans-unit id="t.1" equivalent="no">
>    <source>Constrained text for limited</source>
>    <target>Tekst angielski dla</target>
> </trans-unit>
> <trans-unit id="t.2" equivalent="no">
>    <source>display for English</source>
>    <target>ograniczonego pola</target>
> </trans-unit>
> 
> The translation meets the application requirements, but is not a direct 
> translation of the source and should not be loaded into a leveraged translation 
> memory database.
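
As an illustration of the rule in proposal 1, the following minimal Python sketch selects only the source/target pairs that a conforming tool might load into translation memory. It assumes namespace-free markup as in the examples in this thread, and it assumes that a value set on a <group> is inherited by the units inside it unless they override it. The function name tm_candidates and the extra unit t.3 are hypothetical and not part of the proposal.

```python
import xml.etree.ElementTree as ET


def tm_candidates(root, inherited="yes"):
    """Yield (source, target) text pairs that may be loaded into a TM."""
    for child in root:
        # "equivalent" defaults to "yes". Assumption: a value set on an
        # enclosing <group> applies to its units unless they override it.
        value = child.get("equivalent", inherited)
        if child.tag == "trans-unit":
            if value == "yes":
                source = child.findtext("source", default="")
                target = child.findtext("target", default="")
                yield source, target
        else:
            # Descend into <group> (or any other container) elements.
            yield from tm_candidates(child, value)


# t.1 is taken from the example above; t.3 is a hypothetical ordinary unit.
SAMPLE = """<body>
  <trans-unit id="t.1" equivalent="no">
    <source>Constrained text for limited</source>
    <target>Tekst angielski dla</target>
  </trans-unit>
  <trans-unit id="t.3">
    <source>Save</source>
    <target>Zapisz</target>
  </trans-unit>
</body>"""

pairs = list(tm_candidates(ET.fromstring(SAMPLE)))
print(pairs)
```

Only the unit that relies on the default value is returned; the unit marked equivalent="no" is skipped.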
> 
> 2) A new attribute should be introduced for the <group> element that indicates 
> that the translation of the encompassed <trans-unit> elements must be treated as 
> a whole and not as individual elements:
> 
> merged-translations (no default value, not mandatory)
> 
> Example 1) - linguistically the translation only makes sense if the text within 
> the <group> element is taken as a whole:
> 
> <group merged-translations="yes">
>    <trans-unit id="1" equivalent="false">
>      <source>The text goes on,</source>
>      <target>Texten går vidare</target>
>    </trans-unit>
>    <trans-unit id="2" equivalent="false">
>      <source>and on, and on, and on.</source>
>      <target>och vidare, och vidare.</target>
>    </trans-unit>
> </group>
> 
> Example 2) - incorrect segmentation requires that the translation be taken as a 
> whole:
> 
> <group merged-translations="yes">
>    <trans-unit id="t1" equivalent="false">
>      <source>The German acronym v.</source>
>      <target>Niemiecki skrót v. OT oznacza górną pozycję silnika.</target>
>    </trans-unit>
>    <trans-unit id="t2" equivalent="false">
>      <source>OT signifies the top dead center position for an engine.</source>
>      <target/>
>    </trans-unit>
> </group>
> 
> The use of the merged-translations="yes" attribute at the group level implies 
> that any child <trans-unit> elements should have the "equivalent" attribute set 
> to "no".
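
One plausible reading of proposal 2 is that a tool should pair the combined source text of a merged group with its combined target text, instead of pairing unit by unit. The Python sketch below shows that reading under stated assumptions: namespace-free markup, non-empty texts joined with a single space, and a hypothetical function name (merged_pair). The sample is Example 2 from this thread, written with equivalent="no".

```python
import xml.etree.ElementTree as ET


def merged_pair(group):
    """Return (source, target) for a merged group, or None if not merged."""
    if group.get("merged-translations") != "yes":
        return None
    units = group.findall("trans-unit")
    sources = [u.findtext("source", default="") for u in units]
    targets = [u.findtext("target", default="") for u in units]
    # Assumption: join the non-empty pieces with one space. Empty <target/>
    # elements contribute nothing to the combined translation.
    return (
        " ".join(s for s in sources if s),
        " ".join(t for t in targets if t),
    )


GROUP = """<group merged-translations="yes">
  <trans-unit id="t1" equivalent="no">
    <source>The German acronym v.</source>
    <target>Niemiecki skrót v. OT oznacza górną pozycję silnika.</target>
  </trans-unit>
  <trans-unit id="t2" equivalent="no">
    <source>OT signifies the top dead center position for an engine.</source>
    <target/>
  </trans-unit>
</group>"""

pair = merged_pair(ET.fromstring(GROUP))
print(pair)
```

The individual units stay marked as non-equivalent; only the group-level pair represents a complete translation.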
> 
> 3) Segmentation for "unprocessed" XLIFF files. Where an XLIFF file has been 
> created by a filter and no segmentation has been applied to the individual 
> <source> elements, the XLIFF file can be considered as a normal XML file where 
> the <target> elements constitute text that may be segmented.
> 
> The XLIFF file target elements, which at the time contain the source text, can 
> have segmentation information added by means of a segmentation namespace such as 
> xml:tm using SRX rules. A normal XML XLIFF extraction can then be executed on 
> the file using either an XSLT transformation or a program. The resultant skeleton 
> file will enable the translated text to be merged with the original XLIFF 
> document. An XSLT transformation can then be used to strip out the segmentation 
> namespace, resulting in a "translated" original unsegmented XLIFF file. This 
> solution is ideal for a production process that can handle pipeline 
> transformations and where the XLIFF document constitutes raw, unprocessed 
> extracted text.
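
The final step of the pipeline described above, stripping the segmentation markup from the merged file, could be sketched as follows. This uses Python in place of the XSLT transformation named in the proposal. The namespace URI in SEG_NS is a placeholder and is not the real xml:tm namespace, and the function name strip_segmentation is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used for illustration only.
SEG_NS = "urn:example:segmentation"


def strip_segmentation(elem):
    """Unwrap every descendant in SEG_NS, keeping its text and children."""
    i = 0
    while i < len(elem):
        child = elem[i]
        strip_segmentation(child)  # clean the subtree first
        if not child.tag.startswith("{" + SEG_NS + "}"):
            i += 1
            continue
        kept = list(child)  # children that survive the unwrapping
        text, tail = child.text or "", child.tail or ""
        # The wrapper's leading text moves to whatever precedes it.
        if i == 0:
            elem.text = (elem.text or "") + text
        else:
            elem[i - 1].tail = (elem[i - 1].tail or "") + text
        elem.remove(child)
        for offset, node in enumerate(kept):
            elem.insert(i + offset, node)
        # The wrapper's tail follows its last surviving child, if any.
        if kept:
            kept[-1].tail = (kept[-1].tail or "") + tail
        elif i == 0:
            elem.text = (elem.text or "") + tail
        else:
            elem[i - 1].tail = (elem[i - 1].tail or "") + tail
        i += len(kept)


doc = ET.fromstring(
    '<target xmlns:s="urn:example:segmentation">'
    "<s:seg>First sentence. </s:seg><s:seg>Second sentence.</s:seg>"
    "</target>"
)
strip_segmentation(doc)
print(ET.tostring(doc, encoding="unicode"))
```

After the call, the <target> element holds the plain text of both segments and no segmentation elements remain.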
> 
> This solution is not necessarily suited to interactive segmentation that is 
> being executed within a user interface centered environment, nor where the XLIFF 
> file has already had some form of translation memory matching applied to it. The 
> XLIFF segmentation sub-committee will continue trying to reach a solution for 
> these types of environment.
> 
> Best Regards,
> 
> AZ
> 

-- 


email - azydron@xml-intl.com
smail - c/o Mr. A.Zydron
	PO Box 2167
         Gerrards Cross
         Bucks SL9 8XF
	United Kingdom
Mobile +(44) 7966 477 181
FAX    +(44) 1753 480 465
www - http://www.xml-intl.com







