Subject: Re: [xliff] Simplified XLIFF element tree


  Hi Everyone,

I totally agree with Rodolfo. For systems that have chosen to 
pre-segment, there should not be any compulsion to do otherwise. I 
always viewed the original <seg-source> proposal with a great deal 
of regret. In my mind it was a very bad choice in XLIFF 1.2, and has 
proven to be so.

Interestingly, the main achievement of XLIFF to date, in our experience, 
is to put localizable text in a common XML format. We currently treat 
'foreign' XLIFF files as a special variant of XML for which we have ITS 
rules. We extract the text and matching into our XLIFF 1.2 subset and on 
completion of translation we merge the translated text back into the 
original 'foreign' format. This way we achieve complete interoperability 
without having to try and cope with the myriad possible variations of 
XLIFF natively.
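
As a rough illustration of that round trip, here is a minimal sketch, not our actual implementation. It assumes a namespace-free 'foreign' file with plain <trans-unit>/<source> children, uses a fixed element lookup in place of real ITS rules, and skips the intermediate XLIFF 1.2 subset file. The function names `extract_units` and `merge_translations` are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_units(foreign_xml):
    """Parse the foreign file and collect (id, source text) pairs."""
    root = ET.fromstring(foreign_xml)
    units = []
    for unit in root.iter("trans-unit"):
        source = unit.find("source")
        if source is not None and source.text:
            units.append((unit.get("id"), source.text.strip()))
    return root, units


def merge_translations(root, translations):
    """Write each translation into a <target> child of its trans-unit."""
    for unit in root.iter("trans-unit"):
        text = translations.get(unit.get("id"))
        if text is None:
            continue  # leave untranslated units untouched
        target = unit.find("target")
        if target is None:
            target = ET.SubElement(unit, "target")
        target.text = text
    return ET.tostring(root, encoding="unicode")


# Assumed sample input: a stripped-down, namespace-free 'foreign' file.
FOREIGN = (
    "<xliff><file><body>"
    "<trans-unit id='1'><source>This is the first sentence.</source></trans-unit>"
    "<trans-unit id='2'><source>This is the second sentence.</source></trans-unit>"
    "</body></file></xliff>"
)

root, units = extract_units(FOREIGN)
merged = merge_translations(
    root, {"1": "Første setning.", "2": "Andre setning."}
)
```

The point of the sketch is that the foreign structure is never interpreted beyond locating the text, so whatever else the original file contains comes back unchanged after the merge.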

Best Regards,

AZ

On 24/08/2010 19:46, Rodolfo M. Raya wrote:
>> -----Original Message-----
>> From: Asgeir Frimannsson [mailto:asgeirf@redhat.com]
>> Sent: Tuesday, August 24, 2010 12:37 PM
>> To: xliff
>> Subject: Re: [xliff] Simplified XLIFF element tree
>>
>>
>> I am trying to understand how your approach would work, but find it very
>> hard to come up with a way of working with 'optional' unsegmented content.
>> I think we do agree that a <trans-unit> should hold the translation of a
>> segment, and it should have access to the source-language segment.
>
> It's very easy to segment text at the time of extraction. By doing this, you don't need unsegmented content in the XLIFF file. I've been doing this for many years and I can tell you it works.
>
> I do not agree that the <trans-unit> should have access to the unsegmented content.
>
>> What does concern me with my earlier mock-example is the verbosity of the
>> model when working with content that is typically always a single segment.
>> For instance:
>>
>> <body>
>>    ...
>>    <ex-unit id='block1'>
>>      <content xml:space='default'>
>>        <m type='seg' id='seg1'>This is the first sentence.</m>
>>      </content>
>>      <trans-unit seg-id='seg1'>
>>        <target>Første setning.</target>
>>      </trans-unit>
>>    </ex-unit>
>>    <ex-unit id='block2'>
>>      <content xml:space='default'>
>>        <m type='seg' id='seg1'>This is the second sentence.</m>
>>      </content>
>>      <trans-unit seg-id='seg1'>
>>        <target>Andre setning.</target>
>>      </trans-unit>
>>    </ex-unit>
>>    ...
>> </body>
> The example above is ugly for me.  I don't need <ex-unit>, <content> or <type="seg"> things in my way.
>
> All I need is simple <trans-unit> with simple <source> and <target>.
>
>> In that sense, a model more similar to what we have today in trans-unit (but
>> eliminating <seg-source>) would be easier, for instance:
>>
>> extraction model:
>> <trans-unit>
>>    <source>
>>      This is the first sentence. This is the second sentence.
>>    </source>
>> </trans-unit>
>>
>>
>> after segmentation:
>>
>> <trans-unit>
>>    <source>
>>      <seg id='seg1'>This is the first sentence.</seg>
>>      <seg id='seg2'>This is the second sentence.</seg>
>>    </source>
>> </trans-unit>
> This is horrible again. I don't like those two <seg> things inside source.
>
>> after translation:
>> <trans-unit>
>>    <source>
>>      <seg id='seg1'>This is the first sentence.</seg>
>>      <seg id='seg2'>This is the second sentence.</seg>
>>    </source>
>>    <target>
>>      <seg id='seg1'>Første setning.</seg>
>>      <seg id='seg2'>Andre setning.</seg>
>>    </target>
>> </trans-unit>
> Still horrible, there are two <seg> things inside <target>.
>
> The really bad thing about the model above is that source text is not adjacent to the corresponding translation. The elements holding source text and the corresponding translation should be children of the same element.
>
> Your segments should be much simpler, like in:
>
>   <trans-unit>
>     <source>This is the first sentence.</source>
>     <target>Første setning.</target>
>   </trans-unit>
>   <trans-unit>
>     <source>This is the second sentence.</source>
>     <target>Andre setning.</target>
>   </trans-unit>
>
> If you want, you can keep the unsegmented text somewhere else in the XLIFF file. A simple <extr-text> like this could be useful for your purposes:
>
>        <extr-text>This is the first sentence. This is the second sentence.</extr-text>
>
> You can also use spanning elements inside the unsegmented content to delimit segments, like in
>
>        <extr-text><seg>This is the first sentence.</seg> <seg>This is the second sentence.</seg></extr-text>
>
> You can relate the unsegmented text to the corresponding segments using attributes if you want.
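
A minimal sketch of how segmenting at extraction time and linking by attribute could look. The splitter is deliberately naive, and the attribute names (`id` on <extr-text> and <seg>, `extr-ref` on <trans-unit>) are assumptions made for illustration; nothing in the thread or in XLIFF defines them.

```python
import re
import xml.etree.ElementTree as ET


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def build_units(block_id, text):
    """Return an <extr-text> element and one <trans-unit> per sentence."""
    extr = ET.Element("extr-text", {"id": block_id})
    units = []
    for n, sentence in enumerate(split_sentences(text), start=1):
        seg_id = f"{block_id}-s{n}"
        seg = ET.SubElement(extr, "seg", {"id": seg_id})
        seg.text = sentence
        seg.tail = " "  # keep the space that separated the sentences
        # Hypothetical linking attributes: the unit shares the segment id
        # and points back at the unsegmented block.
        unit = ET.Element("trans-unit", {"id": seg_id, "extr-ref": block_id})
        ET.SubElement(unit, "source").text = sentence
        units.append(unit)
    if len(extr):
        extr[-1].tail = None  # no trailing space after the last segment
    return extr, units


extr, units = build_units(
    "b1", "This is the first sentence. This is the second sentence."
)
extr_xml = ET.tostring(extr, encoding="unicode")
```

Each <trans-unit> stays as simple as in the example above, while a tool that wants the original block can follow `extr-ref` back to it.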
>
> Regards,
> Rodolfo
> --
> Rodolfo M. Raya <rmraya@maxprograms.com>
> Maxprograms      http://www.maxprograms.com

-- 
email - azydron@xtm-intl.com
smail - c/o Mr. A.Zydron
	PO Box 2167
         Gerrards Cross
         Bucks SL9 8XF
	United Kingdom
Mobile +(44) 7966 477 181
FAX    +(44) 1753 480 465
www - http://www.xtm-intl.com
