Subject: RE: [xliff] Simplified XLIFF element tree
Hi,

Of course, a <para> in DocBook or a <p> in HTML will often contain several sentences, and you will want one segment per sentence. You can segment the <para> or <p> at text extraction time and put each segment in its own <trans-unit>.

Adding spanning tags in <source> to indicate segmentation is a very bad idea. Each segment must have its own <source>/<target> pair that can be exported later to create a translation memory or to feed machine translation systems. If you use a spanning mechanism inside <source>, you will have multiple segments in source and target, and the number of source fragments may not match the number of target fragments; that's very bad for TM/MT support and not XSLT friendly at all.

Regards,
Rodolfo
--
Rodolfo M. Raya <rmraya@maxprograms.com>
Maxprograms http://www.maxprograms.com

> -----Original Message-----
> From: Asgeir Frimannsson [mailto:asgeirf@redhat.com]
> Sent: Monday, August 23, 2010 8:21 AM
> To: xliff
> Subject: Re: [xliff] Simplified XLIFF element tree
>
> Hi all,
>
> My initial thoughts on the subject of segmentation:
>
> 1) There are going to be times when the "extraction unit" (e.g. a DocBook <para> or an HTML <p>) is not at the segment level.
> 2) The segmentation process should typically belong to the 'translation domain', not the 'extraction domain', although implementors may choose to add segmentation in the extraction process. This is particularly important where there is an n-to-m relationship between segments in source and translation.
> 3) <seg-source> is not an ideal solution. Annotating segments using <mrk> in <source>, or introducing some other span-annotation mechanism, is my preferred solution.
>
> Perhaps what is lacking in the standard is a clear way to maintain "extraction units" vs "translation units".
> In pre-2.0, the <trans-unit> is typically referred to as the unit of extraction, but perhaps there is a case for a finer-grained trans-unit that is a subset of an extraction unit, which can maintain its own state, annotations, TM/MT suggestions, etc.?
>
> cheers,
> asgeir
>
> ----- "Andrzej Zydron" <azydron@xtm-intl.com> wrote:
>
> > Hi Everyone,
> >
> > I completely agree. We always pre-segment so our <trans-unit> elements each hold a segment. We use the <group> element to signify the higher level at which segmentation has taken place, e.g.:
> >
> > <group id="4">
> >   <trans-unit id="t5" resname="p" translate="yes" xml:space="default">
> >     <source>When a user logs out of the <g id="i2">XTM</g> Client, the Client clears the <g id="i3">UserName</g> and <g id="i4">Password</g> property of the application.</source>
> >     <target>When a user logs out of the <g id="i2">XTM</g> Client, the Client clears the <g id="i3">UserName</g> and <g id="i4">Password</g> property of the application.</target>
> >   </trans-unit>
> >   <trans-unit id="t6" resname="p" translate="yes" xml:space="default">
> >     <source>The application will respond to the associated <g id="i5">PasswordChange</g> event by checking the values of the <g id="i6">UserName</g> and <g id="i7">Password</g>.</source>
> >     <target>The application will respond to the associated <g id="i5">PasswordChange</g> event by checking the values of the <g id="i6">UserName</g> and <g id="i7">Password</g>.</target>
> >   </trans-unit>
> >   <trans-unit id="t7" resname="p" translate="yes" xml:space="default">
> >     <source>If these are empty, the application will log out.</source>
> >     <target>If these are empty, the application will log out.</target>
> >   </trans-unit>
> > </group>
> >
> > Best Regards,
> >
> > AZ
> >
> > On 23/08/2010 10:49, Rodolfo M. Raya wrote:
> > > Hi,
> > >
> > > I like the idea of further simplification.
> > > The elements that Yves removed can be left out.
> > >
> > > Segmentation information was optional in XLIFF 1.2 and will continue to be optional. The <part> element added by Yves should not be part of the basic tree. And, as I see it, any segmentation info will never be inside <source> or <target>; it will be at a higher level, preferably outside <trans-unit>.
> > >
> > > Regards,
> > > Rodolfo
> > > --
> > > Rodolfo M. Raya <rmraya@maxprograms.com>
> > > Maxprograms http://www.maxprograms.com
> > >
> > >> -----Original Message-----
> > >> From: Yves Savourel [mailto:ysavourel@translate.com]
> > >> Sent: Monday, August 23, 2010 12:15 AM
> > >> To: xliff@lists.oasis-open.org
> > >> Subject: RE: [xliff] Simplified XLIFF element tree
> > >>
> > >> Hi,
> > >>
> > >> For the core/minimal XLIFF I would go for an even simpler model than Rodolfo's:
> > >>
> > >> -- I think segmentation is too important to not be part of the core structure of XLIFF, and the representation as extra info, as is the case in 1.2 (because of the need to be backward compatible with 1.1), is not adequate. It does not mean a file must always be segmented, just that representing segmented content must be simple and the processing of segmented vs un-segmented content must be seamless. For simplicity I've represented this as <part> in the tree below, but it would be whatever structure we would end up with.
> > >>
> > >> -- I wouldn't put all the alt-trans data in the core. It corresponds to specific features that are not core to representing extracted text.
> > >>
> > >> -- I would select only essential parts for the core, and therefore not include the skeleton since it's not something essential for the extracted data (i.e. one can merge without using the XLIFF <skel> data, and <skel> is really tool-specific anyway).
> > >>
> > >> -- I would not reduce the XLIFF namespace to the core. Just declare as core a subset of element/attributes of the namespace.
> > >>
> > >> <xliff version1>1
> > >>  |
> > >>  +---<file original1 source-language1 datatype1>+
> > >>       |
> > >>       +---<body>1
> > >>            |
> > >>            +---<group id1 resname? restype?>*
> > >>            |    |
> > >>            |    +--- [trans-unit]*
> > >>            |
> > >>            +---<trans-unit id1 resname? restype?>*
> > >>                 |
> > >>                 +---<source>1
> > >>                 |    |
> > >>                 |    +---<part id?>+
> > >>                 |         |
> > >>                 |         +--- [inline markup]*
> > >>                 |
> > >>                 +---<target>?
> > >>                      |
> > >>                      +---<part id?>+
> > >>                           |
> > >>                           +--- [inline markup]*
> > >>
> > >> Cheers,
> > >> -ys
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe from this mail list, you must leave the OASIS TC that generates this mail. Follow this link to all your TCs in OASIS at:
> > >> https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
> >
> > --
> > email - azydron@xtm-intl.com
> > smail - c/o Mr. A.Zydron
> >         PO Box 2167
> >         Gerrards Cross
> >         Bucks SL9 8XF
> >         United Kingdom
> > Mobile +(44) 7966 477 181
> > FAX +(44) 1753 480 465
> > www - http://www.xtm-intl.com
> >
> > This message contains confidential information and is intended only for the individual named. If you are not the named addressee you may not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system.
> > E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message which arise as a result of e-mail transmission. If verification is required please request a hard-copy version. Unless explicitly stated otherwise this message is provided for informational purposes only and will not be construed as a solicitation or offer.
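
For readers following the thread, the pre-segmentation approach argued for here (segment at extraction time, one <trans-unit> per segment, a <group> per extracted paragraph) can be sketched roughly as below. This is only an illustration under stated assumptions, not any poster's actual tool: the function names `split_sentences`, `paragraph_to_group` and `export_pairs` are hypothetical, the sentence splitter is a naive regular expression rather than real segmentation rules, inline markup such as <g> is ignored, and the German target text is a made-up sample.

```python
import re
import xml.etree.ElementTree as ET


def split_sentences(paragraph):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def paragraph_to_group(paragraph, group_id, first_unit_number):
    """Build one <group> per extracted paragraph, one <trans-unit> per segment."""
    group = ET.Element("group", {"id": str(group_id)})
    for offset, sentence in enumerate(split_sentences(paragraph)):
        unit = ET.SubElement(
            group,
            "trans-unit",
            {"id": f"t{first_unit_number + offset}", "resname": "p"},
        )
        ET.SubElement(unit, "source").text = sentence
    return group


def export_pairs(group):
    """Collect (source, target) text pairs from units with a non-empty target."""
    pairs = []
    for unit in group.findall("trans-unit"):
        target = unit.find("target")
        if target is not None and target.text:
            pairs.append((unit.find("source").text, target.text))
    return pairs


# Plain-text version of the paragraph used in the quoted example.
paragraph = (
    "When a user logs out of the XTM Client, the Client clears the UserName "
    "and Password property of the application. The application will respond "
    "to the associated PasswordChange event by checking the values of the "
    "UserName and Password. If these are empty, the application will log out."
)

group = paragraph_to_group(paragraph, group_id=4, first_unit_number=5)

# Simulate a translator filling in the last segment (sample text only).
last_unit = group.findall("trans-unit")[-1]
ET.SubElement(last_unit, "target").text = (
    "Wenn diese leer sind, meldet sich die Anwendung ab."
)

print(ET.tostring(group, encoding="unicode"))
print(export_pairs(group))
```

Because every segment sits in its own <trans-unit>, exporting aligned pairs is a single pass over the units, with no need to count or match fragments inside <source> and <target>.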