Subject: RE: [xliff] Simplified XLIFF element tree



> -----Original Message-----
> From: Asgeir Frimannsson [mailto:asgeirf@redhat.com]
> Sent: Tuesday, August 24, 2010 12:37 PM
> To: xliff
> Subject: Re: [xliff] Simplified XLIFF element tree
> 
> 
> I am trying to understand how your approach would work, but find it very
> hard to come up with a way of working with 'optional' unsegmented content.
> I think we do agree that a <trans-unit> should hold the translation of a
> segment, and it should have access to the source-language segment.


It's very easy to segment text at the time of extraction. By doing this, you don't need unsegmented content in the XLIFF file. I've been doing this for many years and I can tell you it works.

I do not agree that the <trans-unit> should have access to the unsegmented content. 
 
> What does concern me with my earlier mock-example is the verbosity of the
> model when working with content that is typically always a single segment.
> For instance:
> 
> <body>
>   ...
>   <ex-unit id='block1'>
>     <content xml:space='default'>
>       <m type='seg' id='seg1'>This is the first sentence.</m>
>     </content>
>     <trans-unit seg-id='seg1'>
>       <target>Første setning.</target>
>     </trans-unit>
>   </ex-unit>
>   <ex-unit id='block2'>
>     <content xml:space='default'>
>       <m type='seg' id='seg1'>This is the second sentence.</m>
>     </content>
>     <trans-unit seg-id='seg1'>
>       <target>Andre setning.</target>
>     </trans-unit>
>   </ex-unit>
>   ...
> </body>

The example above is ugly to me. I don't need <ex-unit>, <content> or type='seg' markers in my way.

All I need is simple <trans-unit> with simple <source> and <target>.

> In that sense, a model more similar to what we have today in trans-unit (but
> eliminating <seg-source>) would be easier, for instance:
> 
> extraction model:
> <trans-unit>
>   <source>
>     This is the first sentence. This is the second sentence.
>   </source>
> </trans-unit>
>
>
> after segmentation:
> 
> <trans-unit>
>   <source>
>     <seg id='seg1'>This is the first sentence.</seg>
>     <seg id='seg2'>This is the second sentence.</seg>
>   </source>
> </trans-unit>

This is horrible again. I don't like those two <seg> things inside <source>.
 
> after translation:
> <trans-unit>
>   <source>
>     <seg id='seg1'>This is the first sentence.</seg>
>     <seg id='seg2'>This is the second sentence.</seg>
>   </source>
>   <target>
>     <seg id='seg1'>Første setning.</seg>
>     <seg id='seg2'>Andre setning.</seg>
>   </target>
> </trans-unit>

Still horrible, there are two <seg> things inside <target>.

The really bad thing about the model above is that source text is not adjacent to the corresponding translation. The elements holding source text and the corresponding translation should be children of the same element.

Your segments should be much simpler, as in:

 <trans-unit>
   <source>This is the first sentence.</source>
   <target>Første setning.</target>
 </trans-unit>
 <trans-unit>
   <source>This is the second sentence.</source>
   <target>Andre setning.</target>
 </trans-unit>

If you want, you can keep the unsegmented text somewhere else in the XLIFF file. A simple <extr-text> like this could be useful for your purposes:

      <extr-text>This is the first sentence. This is the second sentence.</extr-text>

You can also use spanning elements inside the unsegmented content to delimit segments, as in:

      <extr-text><seg>This is the first sentence.</seg> <seg>This is the second sentence.</seg></extr-text>

You can relate the unsegmented text to the corresponding segments using attributes if you want.
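
As a rough sketch only, such a link could look like the following. The "id" and "seg-ref" attribute names are made up for illustration; they are not part of XLIFF or of any proposal on this list.

      <!-- hypothetical: "id" on <seg> and "seg-ref" on <trans-unit> are illustrative names only -->
      <extr-text><seg id="s1">This is the first sentence.</seg> <seg id="s2">This is the second sentence.</seg></extr-text>

      <trans-unit seg-ref="s1">
        <source>This is the first sentence.</source>
        <target>Første setning.</target>
      </trans-unit>
      <trans-unit seg-ref="s2">
        <source>This is the second sentence.</source>
        <target>Andre setning.</target>
      </trans-unit>

Each <trans-unit> still holds its own <source> and <target> side by side; the reference only matters to a tool that wants to look up the original unsegmented text.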

Regards,
Rodolfo
--
Rodolfo M. Raya   <rmraya@maxprograms.com>
Maxprograms      http://www.maxprograms.com



