xliff-comment message


Subject: Re: SV: [xliff-comment] Grouping translations across <trans-unit> elements


On Wed, 2006-07-26 at 11:59 -0400, Corneliusson, Fredrik wrote:
> Thanks Rodolfo, but I was not referring to sub-segmentation but to the
> section "2.8 Grouping translations across <trans-unit> elements", and
> this example in particular:
> 
> 
> <group merged-trans="yes">
>  <trans-unit id="t1">
>  <source>The German acronym v.</source>
>  <target equiv-trans="no">Niemiecki skrót v. OT oznacza górna pozycje silnika.</target>
>  </trans-unit>
>  <trans-unit id="t2">
>  <source>OT signifies the top dead center position for an engine.</source>
>  <target equiv-trans="no"/>
>  </trans-unit>
>  </group>
> 
> My proposal would be to generate the XLIFF with all whitespace
> between trans-units with translatable content in separate trans-units,
> this way:
> 
> <group merged-trans="yes">
>  <trans-unit id="t1">
>  <source>The German acronym v.</source>
>  <target equiv-trans="no">Niemiecki skrót v. OT oznacza górna pozycje silnika.</target>
>  </trans-unit>
>  <trans-unit id="t2" translate="no">
>  <source> </source>
>  <target equiv-trans="no"/>
>  </trans-unit>
>  <trans-unit id="t3">
>  <source>OT signifies the top dead center position for an engine.</source>
>  <target equiv-trans="no"/>
>  </trans-unit>
>  </group>
> 
> This way an XLIFF editor/TMX Export will be able to see the merged
> source as:
> "The German acronym v. OT signifies the top dead center position for
> an engine."
> instead of:
> "The German acronym v.OT signifies the top dead center position for an
> engine."
> 
> 
> Imagine, for example, that the whitespace was tabs or new lines and you
> can see the problems that could arise after the file is converted
> back.

Hi Fredrik,

The XLIFF specification does not require removing spaces from the source
element. The example is meant to show a segmentation problem (text
incorrectly broken after an abbreviation). A tool that breaks text after
an abbreviation should preserve the space somewhere, just as the spaces
after each sentence in a paragraph should be preserved.

The example should probably include the initial space in the second
segment, but its omission doesn't change the idea of grouping segments.

The mechanism for joining two segments using <group> elements is not
new. It has been possible for a long time and is even implemented in
commercial CAT tools. Whether the spaces that appear between sentences
are included or removed depends on the criteria of the tool maker.
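
As a rough illustration of the joining step, here is a minimal Python
sketch. It assumes the extraction tool kept the initial space in the second
segment, and it uses a simplified, namespace-free sample (real XLIFF files
declare a namespace). The names GROUP_XML and merged_source are hypothetical
and are not taken from any particular CAT tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second segment keeps its leading space.
GROUP_XML = """<group merged-trans="yes">
 <trans-unit id="t1">
  <source>The German acronym v.</source>
 </trans-unit>
 <trans-unit id="t2">
  <source> OT signifies the top dead center position for an engine.</source>
 </trans-unit>
</group>"""


def merged_source(group_xml: str) -> str:
    """Concatenate the <source> text of every <trans-unit> in the group.

    The text is joined exactly as stored, so any space the extraction
    tool preserved inside a segment survives the merge.
    """
    group = ET.fromstring(group_xml)
    return "".join(
        "".join(source.itertext())
        for source in group.iterfind("trans-unit/source")
    )


print(merged_source(GROUP_XML))
```

With the space kept inside the second <source>, plain concatenation is
enough; a tool that drops the space would have to record it elsewhere to
rebuild the sentence.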

Keep in mind that when you export to TMX a segment that starts or ends
with a space, those spaces should be removed. The TMX specification
explicitly states that <seg> elements contain "Text data (without
leading or trailing white spaces characters)" at
http://www.lisa.org/standards/tmx/tmx.html#seg

FWIW, some CAT tools are currently able to automatically remove
initial/trailing spaces when exporting XLIFF to TMX and to add them back
when importing translations from TMX to XLIFF.
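
One possible way to do that round trip, sketched in Python. This is only an
assumption about how such a tool could work, not a description of any
specific product; split_edge_spaces and restore_edge_spaces are hypothetical
helper names.

```python
from typing import Tuple


def split_edge_spaces(text: str) -> Tuple[str, str, str]:
    """Return (leading whitespace, core text, trailing whitespace)."""
    core = text.strip()
    if not core:
        # Whitespace-only segment: keep it all as leading whitespace.
        return text, "", ""
    leading = text[: len(text) - len(text.lstrip())]
    trailing = text[len(text.rstrip()):]
    return leading, core, trailing


def restore_edge_spaces(leading: str, translated: str, trailing: str) -> str:
    """Re-attach the whitespace remembered at export time to a translation."""
    return leading + translated.strip() + trailing


# On export: only the core text would go into the TMX <seg>.
leading, core, trailing = split_edge_spaces(" OT signifies the top dead center.\n")

# On import: the translation coming back from TMX gets its spaces restored.
restored = restore_edge_spaces(leading, core, trailing)
```

The tool has to store the removed whitespace somewhere between export and
import (for example, keyed by trans-unit id); how that is done is left open
here.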

Best regards,
Rodolfo M Raya
Heartsome


