
Subject: RE: [xliff] Simplified XLIFF element tree


Hi,

If you want to separate "extracted text" from "segmented text", you can use a new element to contain unsegmented extracted text and the traditional <trans-unit> to contain the final segments.

You could represent unsegmented XLIFF with something like:

<body>
  <extr-text id="block-1">Sentence 1. Sentence 2.</extr-text>
  <extr-text id="block-2">Sentence 3. Sentence 4.</extr-text>
</body>

And represent the segmented XLIFF with:

<body>
  <extr-text id="block-1" segmented="yes">Sentence 1. Sentence 2.</extr-text>
  <group id="block-1">
    <trans-unit id="block-1_seg-1">
      <source>Sentence 1.</source>
    </trans-unit>
    <trans-unit id="block-1_seg-2">
      <source>Sentence 2.</source>
    </trans-unit>
  </group>
  <extr-text id="block-2" segmented="yes">Sentence 3. Sentence 4.</extr-text>
   <group id="block-2">
    <trans-unit id="block-2_seg-1">
      <source>Sentence 3.</source>
    </trans-unit>
    <trans-unit id="block-2_seg-2">
      <source>Sentence 4.</source>
    </trans-unit>
  </group>
</body>

Tools that support XLIFF 1.0 and 1.1 can translate segmented files by simply ignoring the new <extr-text> element. Note that once segmentation is done, the <extr-text> elements could be deleted; in my example I kept them and added a segmented="yes" attribute to indicate that the text has been segmented.
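
As a rough sketch of how a consumer could skip the new element, here is a small Python example using only the standard library. The <extr-text> element and its segmented attribute are just the hypothetical names from my example above, not part of any published XLIFF schema, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Segmented sample using the hypothetical <extr-text> element from this message.
SEGMENTED_BODY = """
<body>
  <extr-text id="block-1" segmented="yes">Sentence 1. Sentence 2.</extr-text>
  <group id="block-1">
    <trans-unit id="block-1_seg-1"><source>Sentence 1.</source></trans-unit>
    <trans-unit id="block-1_seg-2"><source>Sentence 2.</source></trans-unit>
  </group>
</body>
"""


def translatable_units(body_xml):
    """Return (id, source text) pairs for every <trans-unit>."""
    body = ET.fromstring(body_xml.strip())
    units = []
    # iter("trans-unit") visits only <trans-unit> elements, so any
    # <extr-text> siblings are never looked at.
    for unit in body.iter("trans-unit"):
        source = unit.find("source")
        text = source.text if source is not None else ""
        units.append((unit.get("id"), text))
    return units


def drop_extracted_text(body_xml):
    """Return the body with already-segmented <extr-text> elements removed."""
    body = ET.fromstring(body_xml.strip())
    # findall() returns a list, so removing children while looping is safe.
    for extr in body.findall("extr-text"):
        if extr.get("segmented") == "yes":
            body.remove(extr)
    return body


if __name__ == "__main__":
    for unit_id, text in translatable_units(SEGMENTED_BODY):
        print(unit_id, text)
```

The second function corresponds to the optional cleanup step: it deletes only those <extr-text> elements that are marked as segmented and leaves the groups untouched.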

Note that, in any case, doing segmentation after the XLIFF document has been created means preparing a new XLIFF document.
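
To illustrate why a new document results, here is a minimal Python sketch of the segmentation step. It assumes the hypothetical <extr-text> element from my example, plain-text content with no inline markup, and a deliberately naive sentence splitter; a real tool would use proper segmentation rules.

```python
import re
import xml.etree.ElementTree as ET


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def segment_body(unsegmented_xml):
    """Build a new <body> in which each <extr-text> is followed by a <group>."""
    old_body = ET.fromstring(unsegmented_xml.strip())
    new_body = ET.Element("body")  # a fresh tree: the input is not modified
    for extr in old_body.findall("extr-text"):
        block_id = extr.get("id")
        # Keep the extracted text and flag it as segmented.
        kept = ET.SubElement(
            new_body, "extr-text", {"id": block_id, "segmented": "yes"}
        )
        kept.text = extr.text
        # One <trans-unit> per sentence, grouped under the block id.
        group = ET.SubElement(new_body, "group", {"id": block_id})
        for n, sentence in enumerate(split_sentences(extr.text or ""), start=1):
            unit = ET.SubElement(
                group, "trans-unit", {"id": f"{block_id}_seg-{n}"}
            )
            ET.SubElement(unit, "source").text = sentence
    return new_body


if __name__ == "__main__":
    sample = (
        '<body><extr-text id="block-1">Sentence 1. Sentence 2.</extr-text></body>'
    )
    print(ET.tostring(segment_body(sample), encoding="unicode"))
```

The output tree is built from scratch rather than edited in place, which mirrors the point above: the segmenting tool ends up writing a new XLIFF document.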

Regards,
Rodolfo
--
Rodolfo M. Raya   <rmraya@maxprograms.com>
Maxprograms      http://www.maxprograms.com


> -----Original Message-----
> From: Yves Savourel [mailto:ysavourel@translate.com]
> Sent: Monday, August 23, 2010 10:48 AM
> To: 'xliff'
> Subject: RE: [xliff] Simplified XLIFF element tree
> 
> > As Andrzej suggested, a "text block" can be represented using
> > the existing <group> element. Each component "segment" can
> > be stored in its own <trans-unit>.
> 
> The problem with using <trans-unit> for segments is that it doesn't support
> the separation between extraction and segmentation by other tools.
> 
> For example, if tool ABC creates an XLIFF document with a non-segmented
> entry looking like this:
> 
> <trans-unit id='id1'>
>  <source>First sentence. Second sentence.</source>
> </trans-unit>
> 
> The tool DEF, which performs the segmentation, has to re-create something
> like this:
> 
> <group id='id1'>
>  <trans-unit id='id1-seg1'>
>   <source>First sentence. </source>
>  </trans-unit>
>  <trans-unit id='id1-seg2'>
>   <source> Second sentence.</source>
>  </trans-unit>
> </group>
> 
> Basically it has to re-create another XLIFF document.
> 
> I do understand the shortcomings of the <seg-source> approach and why
> using <trans-unit> seems like a good solution in 1.2: it allows metadata
> to be attached to segments.
> 
> But we are thinking about 2.0, and I'm sure we can imagine something that
> allows us to make the distinction between an extraction unit and a segment.
> 
> -ys
> 
> 



