Subject: RE: [xliff] Simplified XLIFF element tree
Hi,

Please find attached a modified tree including the new element that David
mentions below. In a separate message I provided examples of unsegmented and
segmented XLIFF fragments using the <extr-text> element. The separation
between segmented and unsegmented files or fragments is much clearer this way.

Regards,
Rodolfo
--
Rodolfo M. Raya <rmraya@maxprograms.com>
Maxprograms http://www.maxprograms.com

> -----Original Message-----
> From: David Filip [mailto:DavidF@MoraviaWorldWide.com]
> Sent: Monday, August 23, 2010 10:55 AM
> To: Asgeir Frimannsson; xliff
> Subject: RE: [xliff] Simplified XLIFF element tree
>
> Rodolfo, Yves, Asgeir, et al.
>
> I like the idea of separating extraction and translation units. It is a
> perfect radical idea for XLIFF minimal/modular brainstorming.
>
> Asgeir is right that we should not just think about how to solve things
> with the present vocabulary. We should be ready for radical changes where
> we can get significant improvements in expressivity.
>
> The merging of segments via <group> and its Boolean "merge" attribute has
> always been just an ugly workaround, though a useful one where the
> extractor introduced wrong segmentation. But this is exactly the point here.
>
> If you introduce an additional layer, you can sort out segmentation issues
> between <extr-unit> and <trans-unit> (including merging via attributes,
> for whatever reason):
>
> <file>+..<group>+ <extr-unit>+ <trans-unit>*
>
> This would be perfectly clean, with no <mrk> spanning.
>
> This radical change would have radical corollaries though, such as moving
> <trans-unit> into an add-on module. The core tree would only have
> <extr-unit>, since <trans-unit> would be enhanced with all sorts of
> translation metadata that could not possibly be considered core.
> <alt-trans> would be, i.e. obligatory in the translation-module.
>
> I think that the advantages in complex content lifecycle workflows would
> prevail, i.e. the ugly "mid" would become obsolete.
>
> Rgds
> dF
>
> David Filip
> Director, Research
> ==============================
> www.moraviaworldwide.com
> Phone: + 420-545-552-203
> Fax: + 420-545-552-233
> Mobile: + 420-731-492244
> E-mail: davidf@moraviaworldwide.com
> ==============================
>
> Thank you for considering the environmental impact of printing emails.
>
> -----Original Message-----
> From: Asgeir Frimannsson [mailto:asgeirf@redhat.com]
> Sent: Monday, August 23, 2010 3:26 PM
> To: xliff
> Subject: Re: [xliff] Simplified XLIFF element tree
>
> Hi Rodolfo,
>
> I think this is a fair approach to accomplish what you need within the
> constraints of the current versions of XLIFF. However, how could we adopt
> this approach within a workflow where segmentation does not happen in the
> extraction process? Am I right in assuming that this approach only works
> when segmentation happens in the extraction process?
>
> cheers,
> asgeir
>
> ----- "Rodolfo M. Raya" <rmraya@maxprograms.com> wrote:
> > Hi again,
> >
> > There are two different issues to consider:
> >
> > 1) How to represent a single segment.
> > 2) How to represent segmentation information in an XLIFF file.
> >
> > I propose to use <trans-unit> with a <source>/<target> pair to
> > represent a single segment (issue 1).
> >
> > Before analyzing issue 2, we need to define some basic concepts and
> > use cases. I would call a portion of extracted translatable text that
> > can be split into two or more "segments" a "text block".
> >
> > As Andrzej suggested, a "text block" can be represented using the
> > existing <group> element. Each component "segment" can be stored in
> > its own <trans-unit>. He provided an example that clearly shows how to
> > do that.
> >
> > The problematic use cases requiring segmentation information that I'm
> > aware of are:
> >
> > a) Wrong segmentation at text extraction time that needs to be fixed
> > at translation time (an unrecognized abbreviation, for example).
> >
> > b) Translation of "m" segments that requires "n" segments in the
> > target language.
> >
> > Both cases can be properly handled by allowing translators to merge
> > and split <trans-unit> elements within a given <group> as needed. This
> > can be done today using <group> and <trans-unit> and their existing
> > attributes. I've implemented this mechanism several times and I know
> > it works very well with XLIFF 1.0, 1.1 and 1.2.
> >
> > We need to define in XLIFF 2.0 the official way in which segments can
> > be merged or split at translation time.
> >
> > Regards,
> > Rodolfo
> > --
> > Rodolfo M. Raya <rmraya@maxprograms.com>
> > Maxprograms http://www.maxprograms.com
> >
> > > -----Original Message-----
> > > From: Yves Savourel [mailto:ysavourel@translate.com]
> > > Sent: Monday, August 23, 2010 9:05 AM
> > > To: 'xliff'
> > > Subject: RE: [xliff] Simplified XLIFF element tree
> > >
> > > >...You can segment the <para> or <p> at text
> > > > extraction time and put each segment in its own <trans-unit>.
> > >
> > > I agree with Asgeir: extracting and segmenting should be two distinct
> > > operations. While they can be done transparently at the same time for
> > > the user, I think it's important to make a distinction between the
> > > representation of the extracted unit and the segments.
> > >
> > > >...If you use a spanning mechanism inside source, you will
> > > > have multiple segments in source and target and the number
> > > > of source fragments may not match the number of target
> > > > fragments; that's very bad for TM/MT support and not XSLT
> > > > friendly at all.
> > >
> > > I agree with Rodolfo: there are some drawbacks with using spans:
> > > order, number of segments, etc. But those issues are maybe a product
> > > of segmentation-related processes we will always have. For example,
> > > an automated tool can create a tentative alignment with n-to-m cases
> > > and provide the result in XLIFF for a user to finish or correct the
> > > aligned set.
> > >
> > > Maybe there are representations other than <trans-unit> or
> > > <seg-source> that would allow a more seamless tracking of segments.
> > > We need to imagine it.
> > >
> > > -ys
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe from this mail list, you must leave the OASIS TC that
> > > generates this mail. Follow this link to all your TCs in OASIS at:
> > > https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
<xliff version1 >1 | +--- <file original1 source-language1 datatype1 >+ | +--- <body>1 | +--- <extr-text id1 resname? restype? segmented?>* | +--- <group id1 resname? restype? >* | | | +--- [trans-unit]* | +--- <trans-unit id1 resname? restype? >* | +--- <source >1 | | | +--- [inline markup]* | +--- <target >? | +--- [inline markup]*
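
The merge operation discussed in the quoted thread (joining adjacent <trans-unit> elements inside a <group> when extraction split a sentence in the wrong place) can be sketched roughly as below. This is only an illustration in Python, not any participant's implementation: the helper name merge_units, the sample fragment, and the choice to concatenate plain text are all assumptions, and inline markup inside <source> or <target> is deliberately ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one extracted paragraph stored as a <group>,
# wrongly split after the abbreviation "Dr." at extraction time.
SAMPLE = """
<group id="p1">
  <trans-unit id="1"><source>Please contact Dr. </source></trans-unit>
  <trans-unit id="2"><source>Smith for details.</source></trans-unit>
  <trans-unit id="3"><source>He is in on Mondays.</source></trans-unit>
</group>
"""


def merge_units(group, first_id, second_id):
    """Merge two adjacent <trans-unit> children of `group` (illustrative helper).

    Only plain text is handled; inline markup would need extra care.
    """
    units = list(group.findall("trans-unit"))
    ids = [u.get("id") for u in units]
    i, j = ids.index(first_id), ids.index(second_id)
    if j != i + 1:
        raise ValueError("only adjacent units can be merged")
    first, second = units[i], units[j]
    for tag in ("source", "target"):
        a, b = first.find(tag), second.find(tag)
        if b is None:
            continue  # nothing to append for this tag
        if a is None:
            a = ET.SubElement(first, tag)
        a.text = (a.text or "") + (b.text or "")
    group.remove(second)  # the second unit is absorbed into the first
    return first


group = ET.fromstring(SAMPLE)
merged = merge_units(group, "1", "2")
print(merged.find("source").text)
print([u.get("id") for u in group.findall("trans-unit")])
```

Splitting would be the inverse: replace one <trans-unit> with two and divide the text at the chosen offset. How ids are assigned after a merge or split is left open here, since the thread does not settle it.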