

Subject: RE: [xliff] Translating XLIFF 1.2


Hi Yves,


I wonder why, if the use of <seg-source> is so important, it is not mandatory now. Let's make it mandatory in XLIFF 2.0 or, better yet, fix the situation in the errata.

We have to fix the XLIFF 1.2 specification and change the definitions of <source> and <target>. We have to explain that <target> is a container for the <mrk> elements that hold the actual translations.

Currently the specs say that <target> contains the translation of <source> and, from what you wrote, that is not the actual intention. We need to change the content model of <target>, as it currently allows plain text and other inline elements besides <mrk>. It should only contain <mrk> elements or inline elements considered untranslatable.
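
To make the proposed restriction concrete, here is a minimal sketch of a check for it. It assumes namespace-free fragments like the ones quoted in this thread, and it treats any text sitting directly under <target> as translatable text outside a <mrk>. The function name target_is_segment_container is hypothetical; parsing uses Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET


def target_is_segment_container(target_xml: str) -> bool:
    """Return True if <target> has no text outside its child elements.

    Assumption: any non-whitespace text directly under <target> (before,
    between or after child elements) is translatable text outside a <mrk>.
    Child elements other than <mrk> are accepted as inline codes.
    """
    target = ET.fromstring(target_xml)
    if target.tag != "target":
        raise ValueError("expected a <target> element")
    # Text before the first child lives in .text; text after each child
    # lives in that child's .tail.
    loose_text = [target.text or ""] + [child.tail or "" for child in target]
    return not any(piece.strip() for piece in loose_text)


segmented = (
    "<target>"
    "<mrk mid='1' mtype='seg'>Mon segment 1.</mrk> "
    "<mrk mid='2' mtype='seg'>Mon segment 2.</mrk>"
    "</target>"
)
plain = "<target>Mon segment 1. Mon segment 2.</target>"

print(target_is_segment_container(segmented))  # True
print(target_is_segment_container(plain))      # False
```

Both fragments are valid under the current 1.2 content model; only the first would pass under the stricter definition.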

We need to clarify the definition of <source> too and indicate that it contains unsegmented text that must be processed and copied into a <seg-source> element.
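
As a rough illustration of "processed and copied into a <seg-source>", the sketch below leaves <source> untouched and adds a <seg-source> with one <mrk mtype='seg'> per sentence. It rests on several assumptions: the <source> holds plain text with no inline elements, the fragment has no namespace, and a sentence ends at '.', '!' or '?' followed by whitespace. The helper name add_seg_source is hypothetical, and a real tool would use proper segmentation rules instead of this regular expression.

```python
import re
import xml.etree.ElementTree as ET


def add_seg_source(trans_unit: ET.Element) -> ET.Element:
    """Insert a <seg-source> after <source>, one <mrk mtype='seg'> per sentence."""
    source = trans_unit.find("source")
    if source is None:
        raise ValueError("<trans-unit> has no <source>")
    text = source.text or ""
    # Naive rule (an assumption): a segment ends at '.', '!' or '?'
    # followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    seg_source = ET.Element("seg-source")
    previous = None
    for number, sentence in enumerate(sentences, start=1):
        mrk = ET.SubElement(seg_source, "mrk", {"mid": str(number), "mtype": "seg"})
        mrk.text = sentence
        if previous is not None:
            # Whitespace between segments stays outside the <mrk> elements;
            # the original spacing is normalised to a single space here.
            previous.tail = " "
        previous = mrk

    # Place <seg-source> right after <source>; <source> itself is not modified.
    position = list(trans_unit).index(source) + 1
    trans_unit.insert(position, seg_source)
    return seg_source


unit = ET.fromstring(
    "<trans-unit id='1'><source>My segment 1. My segment 2.</source></trans-unit>"
)
add_seg_source(unit)
print(ET.tostring(unit, encoding="unicode"))
```

Because the <trans-unit> and its id are unchanged, the structure created by the extraction filter is still there at merge time.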

We must clarify the use of <seg-source>, as its definition says that it "can" contain markup such as segmentation. It must say that the element is intended to contain the segmentation markup. The words "generally" and "typically" must be removed from the definition.

The introduction of the Segmentation section must be changed, as it currently says that it "may be important for the user agent to break down the content of the <source>". It should clearly state that such an operation is required.

If the above-mentioned changes are not made now, the specification document will not say what the XLIFF TC apparently wanted to indicate. That would be terrible, because some developers like me would conclude that segmentation can happen when the XLIFF file is created and would write code that is valid according to the schema and the specification document, but wrong according to the intended spirit of the standard.

Best regards,
Rodolfo 
--
Rodolfo M. Raya   <rmraya@maxprograms.com>
Maxprograms      http://www.maxprograms.com

> -----Original Message-----
> From: Yves Savourel [mailto:ysavourel@translate.com]
> Sent: Monday, February 22, 2010 3:33 PM
> To: xliff@lists.oasis-open.org
> Subject: RE: [xliff] Translating XLIFF 1.2
> 
> Hi Rodolfo,
> 
> > Segmentation is a process that can be done before
> > creating an XLIFF file. It can even be done on
> > the fly while the XLIFF file is being generated.
> 
> I see many different files and file formats every year; the cases where
> the format drives the segmentation are, in my experience, rather rare.
> In any case they can be supported with <seg-source> as well:
> 
> <source>My segment.</source>
> <seg-source><mrk mid='1' mtype='seg'>My segment.</mrk></seg-source>
> 
> In fact, it is very important that <seg-source> is used in those cases:
> That is the only interoperable way another tool can know those trans-
> units are representing single segments.
> 
> 
> Note also it is the opinion of the majority of the XLIFF TC that the
> segmentation happens mostly after extraction. Otherwise you would not
> have this in the 1.2 specification:
> 
> "It is important to note that the manipulation / segmentation of trans-
> unit elements is owned by the "translator" domain, not at the
> extraction filter domain. This means that segmentation will be
> performed by the editing tool or possibly an automated segmentation
> process."
> 
> 
> > If the text to translate is already segmented,
> > there is absolutely no need to use <seg-source> at all.
> 
> Au contraire: It's the perfect opportunity to use <seg-source>.
> 
> 
> > The sentences in a paragraph can easily be maintained
> > together using a <group> element to enclose the related
> > <trans-unit> elements. This also optional "segmentation model"
> > has been possible since XLIFF 1.0.
> 
> First, let's be very clear on one thing: The first time XLIFF has
> addressed segmentation is in 1.2. So there cannot be an XLIFF-standard
> or even a traditional way to represent segmentation before 1.2. This
> also means 1.2 does not have to be backward compatible with any
> segmentation representation because none existed from the viewpoint of
> XLIFF.
> 
> I remember very well the first XLIFF meeting I was in, in Dublin,
> before XLIFF was even named XLIFF. Choosing a name for <trans-unit>
> prompted a discussion about segmentation and we decided that
> segmentation was not going to be addressed. A <source> element is
> broadly defined as "... unit of text that could be a paragraph, a
> title, a menu item, a caption, etc."
> 
> The segmentation representation was only addressed in 1.2, and the TC
> had a sub-committee set to work on it. The result is the <seg-source>
> model.
> 
> So if a specific tool chose to treat <trans-unit> as the result of a
> segmentation, that was an implementer's choice.
> 
> 
> Now, as for using <trans-unit>/<group> to represent a paragraph and its
> sentences:
> 
> While there is nothing precluding a tool from doing this, it does not follow
> the 1.2 specification recommendation where we have: "...It is important
> to note that the manipulation / segmentation of trans-unit elements is
> owned by the "translator" domain, not at the extraction filter domain."
> 
> To me the groups and trans-units are created by the extraction tool.
> And I expect to see those back when merging. If another user-agent
> starts modifying the structure of the XLIFF document we are going to
> have merging problems.
> 
> Let's say a filter creates this:
> 
> <trans-unit id='1'>
>  <source>My segment 1. My segment 2.</source>
> </trans-unit>
> 
> Then a user-agent specialized in segmentation does this:
> 
> <group id='1'>
>  <trans-unit id='1-1'>
>   <source>My segment 1. </source>
>  </trans-unit>
>  <trans-unit id='1-2'>
>   <source>My segment 2.</source>
>  </trans-unit>
> </group>
> 
> Then I open the result in an editor, translate it and the editor
> carefully re-creates the XLIFF it got. I get this:
> 
> <group id='1'>
>  <trans-unit id='1-1'>
>   <source>My segment 1. </source>
>   <target>Mon segment 1. </target>
>  </trans-unit>
>  <trans-unit id='1-2'>
>   <source>My segment 2.</source>
>   <target>Mon segment 2.</target>
>  </trans-unit>
> </group>
> 
> Then the merging tool is in trouble.
> 
> Using <trans-unit>/<group> for representing segmentation is very bad,
> in my opinion, because it prevents other tools from modifying the
> segmentation easily.
> 
> I can understand that in some cases, the <trans-unit> becomes a segment
> unit when the original file format dictates it. But, again, it is a
> rare case, and it is also supported by the <seg-source> model.
> 
> 
> Cheers,
> -ys
> 
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe from this mail list, you must leave the OASIS TC that
> generates this mail.  Follow this link to all your TCs in OASIS at:
> https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php



