Subject: Re: [xliff] internal matches
Yes, thanks! I had been thinking about using a custom <mrk>, and it would work (by adding the match quality in the value attribute), but it is custom and hence not easily understood across tools from multiple parties.
I know that translation candidates are optional too, but if they are understood at all, they should be understood in a common manner. So I guess we really missed an important use case for them.
Internal matches cannot be neglected, given their impact on translation cost. To support them we now need to either implement a custom annotation or implement a database and fuzzy matching engine in translation clients. The latter renders the translation candidates in the XLIFF superfluous, as we would then need to merge them into the same database anyway. In that case we would have been better off with TMX embedded in the XLIFF file header (because it would avoid duplicates) rather than candidates associated with single segments/units.
I do not want to dramatize. I am only thinking that we missed, by a hair, the chance to have a match proposal mechanism that works without a dynamic database of some kind.
Am I the only one worrying?
Yes, you are correct, there is no official way to link content to other content that is the same or very similar (a duplicate or a fuzzy duplicate).
You could probably define some kind of annotation for this:
An mrk element spanning the duplicated/repetition content with ‘ref’ pointing to the original, and possibly ‘value’ as some indicator of the type of match (exact/fuzzy). If there is a need for more info one would have to define a module for that.
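To make the suggestion concrete, here is a minimal Python sketch of such an annotation, built with the standard library's ElementTree. The annotation type `x:internal-match`, the `ref` value, and the use of a 0-100 number in `value` are all assumptions made up for illustration; none of them is defined by XLIFF 2.0, and the sketch only handles a source that contains plain text with no inline elements.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"


def annotate_internal_match(source_el, ref, quality):
    """Wrap the whole text of a plain-text <source> in an <mrk> annotation.

    `ref` points at the original content; `quality` is stored in `value`.
    The type "x:internal-match" is a hypothetical custom annotation type.
    """
    mrk = ET.Element(f"{{{XLIFF_NS}}}mrk", {
        "id": "m1",
        "type": "x:internal-match",  # assumption: custom, not standardized
        "ref": ref,                  # assumption: illustrative fragment reference
        "value": str(quality),       # assumption: match quality as a number
    })
    mrk.text = source_el.text
    source_el.text = None
    source_el.append(mrk)
    return mrk


ET.register_namespace("", XLIFF_NS)
source = ET.Element(f"{{{XLIFF_NS}}}source")
source.text = "Press the green button twice."
annotate_internal_match(source, "#f=f1/u=u1/s1", 93)
xml_out = ET.tostring(source, encoding="unicode")
print(xml_out)
```

Because the type is custom, another tool would have to know this private convention to interpret the annotation.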
I wonder if we have overlooked a use case in the translation candidates module.
As you know, with XLIFF 2.0 it is easy to include reference data, such as matches and glossary data, in the translatable file. The agent working on this data does not need to perform searches or comparisons on the content or reference data, as all reference data can be linked to specific portions of the content data.
However, for reference data that only comes into existence during the modification process, I currently do not see a method to link it.
Think of content data in one segment which, after translation, is a reasonable translation candidate in another segment. This relationship is easy to detect in the XLIFF creation or enricher phase.
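As a rough illustration of how an enricher might detect this relationship, the sketch below compares each segment's source text with all earlier segments using Python's `difflib.SequenceMatcher`. The segment ids, the sample texts, the 0.75 threshold, and the function name `find_internal_matches` are assumptions for illustration only; a real tool would likely use a tokenized, language-aware similarity measure.

```python
from difflib import SequenceMatcher


def find_internal_matches(segments, threshold=0.75):
    """Return (segment_id, earlier_segment_id, ratio) for each segment
    whose source text resembles an earlier segment in the same file."""
    matches = []
    for i, (seg_id, text) in enumerate(segments):
        best = None
        for prev_id, prev_text in segments[:i]:
            ratio = SequenceMatcher(None, prev_text, text).ratio()
            # Keep only the best-scoring earlier segment above the threshold.
            if ratio >= threshold and (best is None or ratio > best[1]):
                best = (prev_id, ratio)
        if best is not None:
            matches.append((seg_id, best[0], round(best[1], 2)))
    return matches


# Hypothetical sample content: s3 is a fuzzy duplicate of s1, s4 repeats s2.
segments = [
    ("s1", "Press the red button twice."),
    ("s2", "Open the cover."),
    ("s3", "Press the green button twice."),
    ("s4", "Open the cover."),
]
internal_matches = find_internal_matches(segments)
for seg_id, earlier_id, ratio in internal_matches:
    print(f"{seg_id} resembles {earlier_id} (similarity {ratio})")
```

The detection itself is straightforward at creation time; the open question in this thread is how to record its result in the file so that the translation agent does not have to repeat it.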
But because this relationship cannot be expressed properly by the reference mechanisms, one still needs to include, e.g., fuzzy matching logic in the translation agent.
That is, if I did not overlook something. Did I?
Senior Technical Director Language Technology,
1240 Route des Dolines
06560 Sophia Antipolis