OASIS Mailing List Archives



xliff message


Subject: RE: [xliff] internal matches

Hi all,


I am not sure we can get all this done at such short notice, particularly the discussions about version integration that Yves mentions below.


Anyway, here is a preliminary proposal, for discussion, to add the capability to point to internal matches to XLIFF 2.1:



Translation Candidate Reference Annotation


This annotation can be used to mark up content with a reference to other content which can be used as a translation proposal, but where the translation is not yet known at the time of annotation.


This annotation can reference any source spans of content that are referenceable via the XLIFF Fragment Identification mechanism.




• The id attribute is REQUIRED

• The type attribute is REQUIRED and set to mtc:imatch

• The ref attribute is REQUIRED and points to source content which can be used as a translation candidate

• The value attribute is OPTIONAL and, if used, represents the similarity value for the translation proposal in the range from 0.0 to 100.0

• The translate attribute is OPTIONAL


For example:


<unit id="u1">

<segment id="s1">

<source>He is my friend.</source>

</segment>

</unit>

<unit id="u2">

<segment id="s1">

<source><mrk id="m1" type="mtc:imatch" ref="#u=u1/s1" value="100.0">He is my friend.</mrk></source>

</segment>

</unit>
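The attribute constraints listed above could be checked mechanically. The following is only a minimal sketch in Python, assuming un-namespaced element names as in the example; the function names check_imatch and check_document are hypothetical and not part of any XLIFF library.

```python
import xml.etree.ElementTree as ET

IMATCH_TYPE = "mtc:imatch"  # type value taken from the proposal

# Sample unit from the example, written with closing tags (assumption: no namespaces).
SAMPLE = (
    '<unit id="u2"><segment id="s1"><source>'
    '<mrk id="m1" type="mtc:imatch" ref="#u=u1/s1" value="100.0">He is my friend.</mrk>'
    '</source></segment></unit>'
)


def check_imatch(mrk):
    """Return a list of problems for one <mrk> element (empty list means OK)."""
    problems = []
    if mrk.get("type") != IMATCH_TYPE:
        return problems  # not an internal-match annotation, nothing to check
    for name in ("id", "ref"):  # both are REQUIRED in the proposal
        if not mrk.get(name):
            problems.append(f"missing REQUIRED attribute '{name}'")
    value = mrk.get("value")  # OPTIONAL similarity value
    if value is not None:
        try:
            number = float(value)
        except ValueError:
            problems.append(f"value '{value}' is not a number")
        else:
            if not 0.0 <= number <= 100.0:
                problems.append(f"value {number} is outside 0.0 to 100.0")
    return problems


def check_document(xml_text):
    """Map each offending <mrk> id to its list of problems."""
    root = ET.fromstring(xml_text)
    report = {}
    for mrk in root.iter("mrk"):
        problems = check_imatch(mrk)
        if problems:
            report[mrk.get("id", "?")] = problems
    return report


print(check_document(SAMPLE))
```

The sketch deliberately ignores the translate attribute, since the proposal only says it is OPTIONAL.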





As you see, there are at least two problems:


1)      Unlike in the original concept of the matches module, the internal matches have to cross unit boundaries

2)      To make the referenced content of an internal match re-segmentable, it would be best to mark it with <mrk> tags, too. In case there is a translation added to that reference, it needs <mrk> tags, too.


The question is whether one should make it a requirement that the ref attribute always points to a <mrk> (to enable resegmentation of the referenced content without breaking the match).
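To illustrate the two problems, here is a rough Python sketch of how a tool might follow such a ref across unit boundaries and see whether it lands on a <mrk> or on a segment. It only understands the "#u=<unit id>/<id>" form used in the example and is not an implementation of XLIFF Fragment Identification; the <file> wrapper, the un-namespaced element names and the helper name resolve_ref are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: both units sit in one <file>, written without namespaces.
DOC = """<file id="f1">
<unit id="u1"><segment id="s1"><source>He is my friend.</source></segment></unit>
<unit id="u2"><segment id="s1"><source><mrk id="m1" type="mtc:imatch" ref="#u=u1/s1" value="100.0">He is my friend.</mrk></source></segment></unit>
</file>"""


def resolve_ref(root, ref):
    """Resolve '#u=<unit id>/<id>' to (unit, element), or None if not found."""
    if not ref.startswith("#u="):
        raise ValueError(f"unsupported reference form: {ref}")
    unit_id, _, inner_id = ref[3:].partition("/")
    for unit in root.iter("unit"):
        if unit.get("id") != unit_id:
            continue
        # Segment and <mrk> ids are only looked up inside the referenced unit.
        for elem in unit.iter():
            if elem.tag in ("segment", "mrk") and elem.get("id") == inner_id:
                return unit, elem
    return None


root = ET.fromstring(DOC)
annotation = root.find(".//mrk[@type='mtc:imatch']")
unit, target = resolve_ref(root, annotation.get("ref"))

# The match crosses a unit boundary: it is annotated in u2 but points into u1.
print(unit.get("id"), target.tag, "".join(target.itertext()))

# If the target is a segment rather than a <mrk>, re-segmenting u1 could break the ref.
points_to_mrk = target.tag == "mrk"
print("ref points to a <mrk>:", points_to_mrk)
```

In the example as given, the ref resolves to a segment, which is exactly the case the proposed requirement would rule out.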


Best regards,




From: Yves Savourel [mailto:ysavourel@enlaso.com]
Sent: Thursday, November 13, 2014 18:51
To: 'Dr. David Filip'; Schurig, Joachim
Cc: xliff@lists.oasis-open.org
Subject: RE: [xliff] internal matches


There are 5 more days until the closure of the 2.1 features. So if this is to have any chance of making it, someone needs to file a proposal very soon.


Also, will we have anyone willing to implement it (before January)?


It would also bring an interesting first case for implementing (or not) backward compatibility with modules:


-   Can a 2.1 document have 2.0 Translation Candidates (or both 2.0 and 2.1)?

-   Would the 2.1 core schema have to include both Translation Candidates schemas?

-   Etc.


A lot of questions about dealing with updated modules will need to be resolved (which is a good thing).







From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Thursday, November 13, 2014 10:12 AM
To: Schurig, Joachim
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] internal matches


I think that this custom annotation would be a natural extension to the mtc module; it should not be too difficult to add.


Dr. David Filip


OASIS XLIFF TC Secretary, Editor, and Liaison Officer 


University of Limerick, Ireland

telephone: +353-6120-2781

cellphone: +353-86-0222-158

facsimile: +353-6120-2734


On Thu, Nov 13, 2014 at 12:12 PM, Schurig, Joachim <Joachim.Schurig@lionbridge.com> wrote:

Hi Yves,


yes, thanks! I had been thinking about using a custom <mrk>, and it would work (by adding the match quality in the value attribute), but it is custom and hence not easily understood across tools from multiple parties.


I know that translation candidates are optional too, but at least if they are understood they should be understood in a common manner. So I guess we really missed an important use case for them.


Internal matches cannot be neglected because of their impact on translation cost, and to support them we now need to implement either a custom annotation or a database and fuzzy matching engine in translation clients. The latter renders translation candidates in the XLIFF superfluous, as we would then actually need to merge them into the same database. In that case we would have been better off with TMX embedded in the XLIFF file header (because it would avoid duplicates) and not associated with single segments/units.


I do not want to dramatize; I am only thinking that we missed by a hair the chance to have a match proposal mechanism that works without a dynamic database of some kind.


Am I the only one worrying?


Best regards,



From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Yves Savourel
Sent: Monday, November 10, 2014 16:42
To: xliff@lists.oasis-open.org
Subject: RE: [xliff] internal matches


Hi Joachim,


Yes, you are correct, there is no official way to link content to other content that is the same or very similar (a duplicate or a fuzzy duplicate).


You could probably define some kind of annotation for this:

An mrk element spanning the duplicated/repetition content with ‘ref’ pointing to the original, and possibly ‘value’ as some indicator of the type of match (exact/fuzzy). If there is a need for more info one would have to define a module for that.





From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Schurig, Joachim
Sent: Monday, November 10, 2014 5:56 AM
To: xliff@lists.oasis-open.org
Subject: [xliff] internal matches


Dear colleagues,


I wonder if we have overlooked a use case in the translation candidates module.


As you know, with XLIFF 2.0 it is easy to include reference data, such as matches and glossary data, in the translatable file. The agent working on these data does not need to perform searches or comparisons on the content or reference data, as all reference data can be linked to specific portions of the content data.


However, for reference data which is only created during the modification process, I currently do not see a method to link it.


Think of content data in one segment which, after translation, is a reasonable translation candidate in another segment. This relationship is easy to detect in the XLIFF creation or enricher phase.


But because this relationship cannot be expressed properly by reference mechanisms, one still needs to include, e.g., fuzzy matching logic in the translation agent.
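As a rough illustration of how an enricher could detect such relationships up front, here is a small Python sketch. It uses difflib.SequenceMatcher from the standard library as a stand-in similarity measure, which is an assumption and not the scoring of any real translation memory engine; the function name internal_matches, the 75.0 threshold and the "#u=<unit>/<segment>" reference string are likewise hypothetical.

```python
from difflib import SequenceMatcher


def internal_matches(sources, threshold=75.0):
    """Find, for each segment, the best earlier segment that resembles it.

    sources: list of (unit id, segment id, source text) in document order.
    Returns a list of ((unit id, segment id), reference, score) tuples,
    where score is a similarity from 0.0 to 100.0.
    """
    found = []
    for i, (unit_id, seg_id, text) in enumerate(sources):
        best = None
        # Only earlier segments are candidates, so a segment never matches itself.
        for prev_unit, prev_seg, prev_text in sources[:i]:
            ratio = SequenceMatcher(None, prev_text, text).ratio()
            score = round(ratio * 100.0, 1)
            if score >= threshold and (best is None or score > best[1]):
                best = (f"#u={prev_unit}/{prev_seg}", score)
        if best is not None:
            found.append(((unit_id, seg_id), best[0], best[1]))
    return found


segments = [
    ("u1", "s1", "He is my friend."),
    ("u2", "s1", "He is my friend."),
    ("u3", "s1", "Completely different text here"),
]
for where, ref, score in internal_matches(segments):
    print(where, "->", ref, score)
```

The point of the sketch is only that this comparison happens once, when the file is created or enriched; what is missing is a common way to store the result so that the translation agent does not have to repeat it.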


That is, if I did not overlook something. Did I?





Joachim Schurig
Senior Technical Director Language Technology,

Lionbridge Fellow


1240 Route des Dolines

06560 Sophia Antipolis



