OASIS Mailing List Archives: xliff


Subject: RE: [xliff] Simplified XLIFF element tree


Hi,

Please find attached a modified tree including the new element that David mentions below.

In a separate message I provided examples of unsegmented and segmented XLIFF fragments using the <extr-text> element.

The separation between segmented and unsegmented files or fragments is much clearer this way.

Regards,
Rodolfo
--
Rodolfo M. Raya   <rmraya@maxprograms.com>
Maxprograms      http://www.maxprograms.com


> -----Original Message-----
> From: David Filip [mailto:DavidF@MoraviaWorldWide.com]
> Sent: Monday, August 23, 2010 10:55 AM
> To: Asgeir Frimannsson; xliff
> Subject: RE: [xliff] Simplified XLIFF element tree
> 
> Rodolfo, Yves, Asgeir et al.,
> I like the idea of separating extraction and translation units. It is a perfect
> radical idea for XLIFF minimal/modular brainstorming.
> 
> Asgeir is right that we should not just think how to solve things with the
> present vocabulary. We should be ready for radical changes where we can
> get significant improvements in expressivity.
> 
> The merging of segments via <group> and its Boolean "merge" attribute has
> always been just an ugly workaround, though a useful one where the extractor
> introduced wrong segmentation. But this is exactly the point here.
> 
> If you introduce an additional layer, you can sort out segmentation issues
> between <extr-unit> and <trans-unit> (including merging via attributes,
> for whatever reason):
> <file>+..<group>+ <extr-unit>+ <trans-unit>*
> This would be perfectly clean, with no <mrk> spanning.
> 
> This radical change would have radical corollaries, though, such as moving
> <trans-unit> into an add-on module. The core tree would then have only
> <extr-unit>, since <trans-unit> would be enhanced with all sorts of
> translation metadata that could not possibly be considered core.
> <alt-trans>, for instance, would be obligatory in the translation module.
> 
> I think that the advantages in complex content lifecycle workflows would
> prevail, i.e. the ugly "mid" would become obsolete.
> 
> Rgds
> dF
> 
> 
> David Filip
> Director, Research
> ==============================
> www.moraviaworldwide.com
> Phone:    + 420-545-552-203
> Fax:         + 420-545-552-233
> Mobile:   + 420-731-492244
> E-mail:     davidf@moraviaworldwide.com
> ==============================
> 
> Thank you for considering the environmental impact of printing emails.
> 
> 
> -----Original Message-----
> From: Asgeir Frimannsson [mailto:asgeirf@redhat.com]
> Sent: Monday, August 23, 2010 3:26 PM
> To: xliff
> Subject: Re: [xliff] Simplified XLIFF element tree
> 
> Hi Rodolfo,
> 
> I think this is a fair approach to accomplish what you need within the
> constraints of the current versions of XLIFF. However, how could we
> adopt this approach within a workflow where segmentation does not
> happen in the extraction process? Am I right in assuming that this approach
> only works when segmentation happens in the extraction process?
> 
> cheers,
> asgeir
> 
> ----- "Rodolfo M. Raya" <rmraya@maxprograms.com> wrote:
> > Hi again,
> >
> > There are two different issues to consider:
> >
> > 1) How to represent a single segment.
> > 2) How to represent segmentation information in an XLIFF file.
> >
> > I propose to use <trans-unit> with a <source>/<target> pair to
> > represent a single segment (issue 1).
> >
> > Before analyzing issue 2, we need to define some basic concepts and
> > use cases. I would use the term "text block" for a portion of extracted
> > translatable text that can be split into two or more "segments".
> >
> > As Andrzej suggested, a "text block" can be represented using the
> > existing <group> element. Each component "segment" can be stored in
> > its own <trans-unit>. He provided an example that clearly shows how to
> > do that.
> >
> > The problematic use cases requiring segmentation information that I'm
> > aware of are:
> >
> > a) Wrong segmentation at text extraction time that needs to be fixed
> > at translation time (an unrecognized abbreviation for example).
> >
> > b) Translation of "m" segments that requires "n" segments in the target
> > language.
> >
> > Both cases can be properly handled by allowing translators to merge
> > and split <trans-unit> elements within a given <group> as needed. This
> > can be done today using <group> and <trans-unit> and their existing
> > attributes. I've implemented this mechanism several times and I know
> > it works very well with XLIFF 1.0, 1.1 and 1.2.
> >
> > We need to define in XLIFF 2.0 the official way in which segments can
> > be merged or split at translation time.
> >
> > Regards,
> > Rodolfo
> > --
> > Rodolfo M. Raya   <rmraya@maxprograms.com>
> > Maxprograms      http://www.maxprograms.com
> >
> >
> > > -----Original Message-----
> > > From: Yves Savourel [mailto:ysavourel@translate.com]
> > > Sent: Monday, August 23, 2010 9:05 AM
> > > To: 'xliff'
> > > Subject: RE: [xliff] Simplified XLIFF element tree
> > >
> > > >...You can segment the <para> or <p> at text
> > > > extraction time and put each segment in its own <trans-unit>.
> > >
> > > I agree with Asgeir: extracting and segmenting should be two distinct
> > > operations. While they can be done transparently at the same time for
> > > the user, I think it's important to make a distinction between the
> > > representation of the extracted unit and the segments.
> > >
> > >
> > > >...If you use a spanning mechanism inside source, you will
> > > > have multiple segments in source and target and the number
> > > > of source fragments may not match the number of target
> > > > fragments; that's very bad for TM/MT support and not XSLT
> > > > friendly at all.
> > >
> > > I agree with Rodolfo: there are some drawbacks with using spans: order,
> > > number of segments, etc. But those issues are maybe a product of
> > > segmentation-related processes we will always have. For example an
> > > automated tool can create a tentative alignment with n-to-m cases and
> > > provide the result in XLIFF for a user to finish/correct the aligned set.
> > >
> > > Maybe there are other representations we can have other than using
> > > <trans-unit> or using <seg-source> that would allow a more seamless
> > > tracking of segments. We need to imagine it.
> > >
> > > -ys
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe from this mail list, you must leave the OASIS TC that
> > > generates this mail.  Follow this link to all your TCs in OASIS at:
> > > https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php


<xliff version1 >1
|
+--- <file original1 source-language1 datatype1 >+
     |
     +--- <body>1
          |
          +--- <extr-text id1 resname? restype? segmented?>*
          |
          +--- <group id1 resname? restype? >*
          |    |
          |    +--- [trans-unit]*
          |
          +--- <trans-unit id1 resname? restype? >*
               |
               +--- <source >1
               |    |
               |    +--- [inline markup]*
               |
               +--- <target >?
                    |
                    +--- [inline markup]*
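

The cardinalities in the tree above can be checked mechanically. What follows is only a rough sketch, not part of the proposal. It assumes the suffixes read as 1 = required (exactly one), ? = optional, + = one or more, and * = zero or more. It also assumes un-namespaced elements and leaves the content of <extr-text> unchecked, since the tree does not define it. The function name check_tree, the sample file, and the attribute values in it are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes marked "1" (required) in the tree, per element.
REQUIRED_ATTRS = {
    "xliff": ["version"],
    "file": ["original", "source-language", "datatype"],
    "extr-text": ["id"],
    "group": ["id"],
    "trans-unit": ["id"],
}


def check_tree(xml_text):
    """Return a list of violations of the simplified tree (empty if none)."""
    root = ET.fromstring(xml_text)
    problems = []

    def require_attrs(elem):
        for name in REQUIRED_ATTRS.get(elem.tag, []):
            if name not in elem.attrib:
                problems.append(
                    f"<{elem.tag}> is missing required attribute '{name}'"
                )

    def check_trans_unit(unit):
        require_attrs(unit)
        # <source> is marked "1", <target> is marked "?".
        if len(unit.findall("source")) != 1:
            problems.append("<trans-unit> must contain exactly one <source>")
        if len(unit.findall("target")) > 1:
            problems.append("<trans-unit> may contain at most one <target>")

    if root.tag != "xliff":
        return [f"root element is <{root.tag}>, expected <xliff>"]
    require_attrs(root)

    files = root.findall("file")
    if not files:
        problems.append("<xliff> must contain at least one <file>")
    for file_elem in files:
        require_attrs(file_elem)
        bodies = file_elem.findall("body")
        if len(bodies) != 1:
            problems.append("<file> must contain exactly one <body>")
        for body in bodies:
            for child in body:
                if child.tag == "extr-text":
                    require_attrs(child)  # content model not checked
                elif child.tag == "group":
                    require_attrs(child)
                    # Only <trans-unit> children of a group are inspected.
                    for unit in child.findall("trans-unit"):
                        check_trans_unit(unit)
                elif child.tag == "trans-unit":
                    check_trans_unit(child)
                else:
                    problems.append(f"unexpected <{child.tag}> inside <body>")
    return problems


# Hypothetical sample: one "text block" (<group>) holding two segments.
SAMPLE = """<xliff version="2.0">
  <file original="hello.txt" source-language="en" datatype="plaintext">
    <body>
      <group id="g1">
        <trans-unit id="1"><source>Hello.</source><target>Hola.</target></trans-unit>
        <trans-unit id="2"><source>Goodbye.</source></trans-unit>
      </group>
    </body>
  </file>
</xliff>"""

print(check_tree(SAMPLE))  # prints []
```

A <trans-unit> with no id and no <source> would be reported twice, once for each missing item.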


