OASIS Mailing List Archives
xliff message


Subject: RE: [xliff] Simplified XLIFF element tree


Hi Asgeir,

The approach I described works when segmentation is done at text extraction time and also at a later stage, which could be a segmentation pass or translation time.

If you want to delay segmentation, you can create a <group> with the "text block" to be translated contained in a single <trans-unit> child. Later, when needed, that <trans-unit> can be split either automatically by some process or manually by the translator. As a result, you will have a <group> containing as many <trans-unit> elements as the segments you need.
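
A minimal Python sketch of that delayed split, for illustration only. The fragment, the split_trans_unit and naive_sentences helpers, the ".1", ".2" id suffixes and the regex segmenter are all hypothetical and not part of XLIFF or of any tool mentioned here. Namespaces and inline markup are ignored.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: the whole "text block" sits in one <trans-unit>.
XLIFF_FRAGMENT = """
<group id="p1">
  <trans-unit id="p1-1">
    <source>First sentence. Second sentence. Third one.</source>
  </trans-unit>
</group>
""".strip()

def naive_sentences(text):
    # Placeholder segmenter (assumption): break after '.', '!' or '?'
    # when followed by whitespace. A real tool would use proper rules.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def split_trans_unit(group, unit, segments):
    """Replace `unit` inside `group` with one <trans-unit> per segment."""
    position = list(group).index(unit)
    base_id = unit.get("id")
    group.remove(unit)
    for offset, text in enumerate(segments):
        # Assumed id scheme: "<original id>.<n>", so the split can be undone.
        new_unit = ET.Element("trans-unit", {"id": f"{base_id}.{offset + 1}"})
        ET.SubElement(new_unit, "source").text = text
        group.insert(position + offset, new_unit)

group = ET.fromstring(XLIFF_FRAGMENT)
unit = group.find("trans-unit")
split_trans_unit(group, unit, naive_sentences(unit.findtext("source")))

ids = [u.get("id") for u in group.findall("trans-unit")]
sources = [u.findtext("source") for u in group.findall("trans-unit")]
```

The <group> keeps its identity throughout, so the boundary of the original text block is never lost, whatever the number of segments inside it.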

Once segmentation has been done by an automated procedure, translators can merge consecutive segments within a <group> as needed to accommodate "m-to-n" cases. When two segments are merged, the <source> and <target> of one of them are appended or prepended to the <source> and <target> of the other, and the segment whose content was moved is then marked as untranslatable.
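
A rough Python sketch of such a merge, under stated assumptions: the emptied segment is the one flagged, the flag is the translate="no" attribute that XLIFF 1.2 allows on <trans-unit>, and the text is joined with a single space. The sample data and the merge_into_previous helper are hypothetical. Inline markup is not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical case: a wrong break after the abbreviation "Fig.".
XLIFF_FRAGMENT = """
<group id="p1">
  <trans-unit id="s1"><source>See Fig.</source><target>Ver Fig.</target></trans-unit>
  <trans-unit id="s2"><source>3 for details.</source><target>3 para detalles.</target></trans-unit>
</group>
""".strip()

def merge_into_previous(keep, absorbed, separator=" "):
    """Append the text of `absorbed` to `keep`, then disable `absorbed`."""
    for tag in ("source", "target"):
        kept = keep.find(tag)
        moved = absorbed.find(tag)
        if kept is None or moved is None:
            continue  # nothing to merge for this element
        parts = [t for t in (kept.text, moved.text) if t]
        kept.text = separator.join(parts)
        moved.text = ""
    # Assumption: the emptied unit stays in place and is marked untranslatable.
    absorbed.set("translate", "no")

group = ET.fromstring(XLIFF_FRAGMENT)
first, second = group.findall("trans-unit")
merge_into_previous(first, second)

merged_source = first.findtext("source")
merged_target = first.findtext("target")
```

Keeping the emptied <trans-unit> in the file, instead of deleting it, preserves the original unit ids for the reverse conversion.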

Converting the translated XLIFF back to the original format is simple: automatically merge all segments that were split after XLIFF creation time, then do the reverse conversion as usual.
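
One way the automatic re-merge could look in Python, as a sketch only. It assumes that units split after creation time carry ids of the form "<original id>.<n>", that original ids contain no ".", that the <group> holds only <trans-unit> children, and that a single space is an acceptable separator. None of this is prescribed by XLIFF; the rejoin_split_units and original_id helpers are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical translated file: u1 was split in two after extraction.
XLIFF_FRAGMENT = """
<group id="p1">
  <trans-unit id="u1.1"><source>One.</source><target>Uno.</target></trans-unit>
  <trans-unit id="u1.2"><source>Two.</source><target>Dos.</target></trans-unit>
  <trans-unit id="u2"><source>Three.</source><target>Tres.</target></trans-unit>
</group>
""".strip()

def original_id(unit):
    # Assumed convention: split units are named "<original id>.<n>".
    return unit.get("id").split(".", 1)[0]

def rejoin_split_units(group, separator=" "):
    """Collapse consecutive units that share an original id into one unit."""
    units = group.findall("trans-unit")
    for unit in units:
        group.remove(unit)
    for orig, members in groupby(units, key=original_id):
        members = list(members)
        merged = ET.SubElement(group, "trans-unit", {"id": orig})
        for tag in ("source", "target"):
            texts = [t for t in (m.findtext(tag) for m in members) if t]
            ET.SubElement(merged, tag).text = separator.join(texts)

group = ET.fromstring(XLIFF_FRAGMENT)
rejoin_split_units(group)

ids = [u.get("id") for u in group.findall("trans-unit")]
targets = [u.findtext("target") for u in group.findall("trans-unit")]
```

After this pass the file has the same units it had at creation time, so the usual reverse conversion can run unchanged.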

Regards,
Rodolfo
--
Rodolfo M. Raya   <rmraya@maxprograms.com>
Maxprograms      http://www.maxprograms.com


> -----Original Message-----
> From: Asgeir Frimannsson [mailto:asgeirf@redhat.com]
> Sent: Monday, August 23, 2010 10:26 AM
> To: xliff
> Subject: Re: [xliff] Simplified XLIFF element tree
> 
> Hi Rodolfo,
> 
> I think this is a fair approach to accomplish what you need within the
> constraints of the current versions of XLIFF. However, how could we adopt
> this approach within a workflow where segmentation does not happen in the
> extraction process? Am I right in assuming that this approach only works
> when segmentation happens in the extraction process?
> 
> cheers,
> asgeir
> 
> ----- "Rodolfo M. Raya" <rmraya@maxprograms.com> wrote:
> > Hi again,
> >
> > There are two different issues to consider:
> >
> > 1) How to represent a single segment.
> > 2) How to represent segmentation information in an XLIFF file.
> >
> > I propose to use <trans-unit> with a <source>/<target> pair to
> > represent a single segment (issue 1).
> >
> > Before analyzing issue 2, we need to define some basic concepts and
> > use cases. I would use the term "text block" for a portion of extracted
> > translatable text that can be split into two or more "segments".
> >
> > As Andrzej suggested, a "text block" can be represented using the
> > existing <group> element. Each component "segment" can be stored in
> > its own <trans-unit>. He provided an example that clearly shows how to
> > do that.
> >
> > The problematic use cases requiring segmentation information that I'm
> > aware of are:
> >
> > a) Wrong segmentation at text extraction time that needs to be fixed
> > at translation time (an unrecognized abbreviation for example).
> >
> > b) Translation of "m" segments that requires "n" segments in target
> > language.
> >
> > Both cases can be properly handled by allowing translators to merge
> > and split <trans-unit> elements within a given <group> as needed. This
> > can be done today using <group> and <trans-unit> and their existing
> > attributes. I've implemented this mechanism several times and I know
> > it works very well with XLIFF 1.0, 1.1 and 1.2.
> >
> > We need to define in XLIFF 2.0 the official way in which segments can
> > be merged or split at translation time.
> >
> > Regards,
> > Rodolfo
> > --
> > Rodolfo M. Raya   <rmraya@maxprograms.com>
> > Maxprograms      http://www.maxprograms.com
> >
> >
> > > -----Original Message-----
> > > From: Yves Savourel [mailto:ysavourel@translate.com]
> > > Sent: Monday, August 23, 2010 9:05 AM
> > > To: 'xliff'
> > > Subject: RE: [xliff] Simplified XLIFF element tree
> > >
> > > >...You can segment the <para> or <p> at text
> > > > extraction time and put each segment in its own <trans-unit>.
> > >
> > > I agree with Asgeir: extracting and segmenting should be two distinct
> > > operations. While they can be done transparently at the same time for the
> > > user, I think it's important to make a distinction between the
> > > representation of the extracted unit and the segments.
> > >
> > >
> > > >...If you use a spanning mechanism inside source, you will
> > > > have multiple segments in source and target and the number
> > > > of source fragments may not match the number of target
> > > > fragments; that's very bad for TM/MT support and not XSLT
> > > > friendly at all.
> > >
> > > I agree with Rodolfo: there are some drawbacks with using spans: order,
> > > number of segments, etc. But those issues are maybe a product of
> > > segmentation-related processes we will always have. For example an
> > > automated tool can create a tentative alignment with n-to-m cases and
> > > provide the result in XLIFF for a user to finish/correct the aligned set.
> > >
> > > Maybe there are other representations we can have other than using
> > > <trans-unit> or using <seg-source> that would allow a more seamless
> > > tracking of segments. We need to imagine it.
> > >
> > > -ys
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe from this mail list, you must leave the OASIS TC that
> > > generates this mail.  Follow this link to all your TCs in OASIS at:
> > > https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
