Subject: RE: [xliff] Simplified XLIFF element tree
Hi,

If you want to separate "extracted text" from "segmented text", you can use a new element to contain unsegmented extracted text and the traditional <trans-unit> to contain the final segments.

You could represent unsegmented XLIFF with something like:

<body>
  <extr-text id="block-1">Sentence 1. Sentence 2.</extr-text>
  <extr-text id="block-2">Sentence 3. Sentence 4.</extr-text>
</body>

And represent the segmented XLIFF with:

<body>
  <extr-text id="block-1" segmented="yes">Sentence 1. Sentence 2.</extr-text>
  <group id="block-1">
    <trans-unit id="block-1_seg-1">
      <source>Sentence 1.</source>
    </trans-unit>
    <trans-unit id="block-1_seg-2">
      <source>Sentence 2.</source>
    </trans-unit>
  </group>
  <extr-text id="block-2" segmented="yes">Sentence 3. Sentence 4.</extr-text>
  <group id="block-2">
    <trans-unit id="block-2_seg-1">
      <source>Sentence 3.</source>
    </trans-unit>
    <trans-unit id="block-2_seg-2">
      <source>Sentence 4.</source>
    </trans-unit>
  </group>
</body>

Tools that support XLIFF 1.0 and 1.1 can translate segmented files by simply ignoring the new <extr-text> element.

Notice that after segmentation has been done, the <extr-text> elements could be deleted; in my example I added an attribute to indicate that the text has been segmented.

Notice that, in any case, doing segmentation after the XLIFF has been created means preparing a new XLIFF document.

Regards,
Rodolfo
--
Rodolfo M. Raya <rmraya@maxprograms.com>
Maxprograms http://www.maxprograms.com

> -----Original Message-----
> From: Yves Savourel [mailto:ysavourel@translate.com]
> Sent: Monday, August 23, 2010 10:48 AM
> To: 'xliff'
> Subject: RE: [xliff] Simplified XLIFF element tree
>
> > As Andrzej suggested, a "text block" can be represented using
> > the existing <group> element. Each component "segment" can
> > be stored in its own <trans-unit>.
>
> The problem with using <trans-unit> for segments is that it doesn't support
> the separation between extraction and segmentation by other tools.
>
> For example, if tool ABC creates an XLIFF document with a non-segmented
> entry looking like this:
>
> <trans-unit id='id1'>
>  <source>First sentence. Second sentence.</source>
> </trans-unit>
>
> The tool DEF, which performs the segmentation, has to re-create something
> like this:
>
> <group id='id1'>
>  <trans-unit id='id1-seg1'>
>   <source>First sentence. </source>
>  </trans-unit>
>  <trans-unit id='id1-seg2'>
>   <source> Second sentence.</source>
>  </trans-unit>
> </group>
>
> Basically it has to re-create another XLIFF document.
>
> I do understand the shortcomings of the <seg-source> approach and why
> using <trans-unit> seems like a good solution in 1.2: it allows metadata
> to be attached to segments.
>
> But we are thinking about 2.0, and I'm sure we can imagine something that
> allows us to make the distinction between an extraction unit and a segment.
>
> -ys
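
As an illustration of the segmentation step discussed in this thread, the following is a minimal Python sketch that reads a <body> of unsegmented <extr-text> blocks and produces a new <body> in which each block is kept, marked segmented="yes", and followed by a <group> of <trans-unit> segments. It rests on several assumptions: <extr-text> and its "segmented" attribute are only the proposal made in this message and are not part of any published XLIFF version; the function names split_sentences and segment are hypothetical; and the sentence splitter is a naive regular expression, not a real segmentation engine.

```python
import re
import xml.etree.ElementTree as ET


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def segment(body):
    """Build a new <body>: each <extr-text> is copied with segmented="yes"
    and followed by a <group> holding one <trans-unit> per sentence."""
    new_body = ET.Element("body")
    for block in body.findall("extr-text"):
        block_id = block.get("id")
        # Keep the original extracted text, flagged as already segmented.
        kept = ET.SubElement(
            new_body, "extr-text", {"id": block_id, "segmented": "yes"}
        )
        kept.text = block.text
        # One group per block, one trans-unit per sentence.
        group = ET.SubElement(new_body, "group", {"id": block_id})
        for n, sentence in enumerate(split_sentences(block.text or ""), start=1):
            unit = ET.SubElement(
                group, "trans-unit", {"id": f"{block_id}_seg-{n}"}
            )
            ET.SubElement(unit, "source").text = sentence
    return new_body


UNSEGMENTED = """<body>
<extr-text id="block-1">Sentence 1. Sentence 2.</extr-text>
<extr-text id="block-2">Sentence 3. Sentence 4.</extr-text>
</body>"""

segmented = segment(ET.fromstring(UNSEGMENTED))
print(ET.tostring(segmented, encoding="unicode"))
```

The sketch builds a fresh tree instead of modifying the input, which matches the observation in the thread that segmenting after extraction amounts to preparing a new XLIFF document.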