Subject: RE: [xliff] ref value in translation candidates module element <mtc:match>
-------- Original Message --------
Subject: RE: [xliff] ref value in translation candidates module element <mtc:match>
From: "Yoshito Umaoka" <yoshito_umaoka@us.ibm.com>
Date: Tue, April 06, 2021 4:07 pm
To: "Mr. Rodolfo Raya" <rmraya@maxprograms.com>
Cc: xliff@lists.oasis-open.org
Thanks Rodolfo for your quick response.
>The specs says that "ref" attribute points to a span of text, that does not mean it must be a <mrk> element, I take it just means that it has to point to an element that contains the text that is being matched.
That is roughly what I initially thought, and I thought it made sense to point to the <segment> element. However, once I realized that the original intent was not to prevent the matched text span from being in the <target> element, I started to think the spec really means the exact text span that is specified.
For example:
<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en"
    xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0" xml:space="preserve"
    xmlns:mtc="urn:oasis:names:tc:xliff:matches:2.0" trgLang="fr">
  <file id="f1">
    <unit id="u1">
      <mtc:matches>
        <mtc:match id="tc1" ref="#s1" similarity="65">
          <source>Please check the output.</source>
          <target>Veuillez vérifier la sortie.</target>
        </mtc:match>
      </mtc:matches>
      <segment id="s1">
        <source>Please check the results.</source>
        <target>Veuillez vérifier les résultats.</target>
      </segment>
    </unit>
  </file>
</xliff>
If we allow <segment> to be referenced from <mtc:match> in a bilingual XLIFF file, it becomes ambiguous whether the match/similarity comes from the <source> value or the <target> value.
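As a rough illustration of that ambiguity, the following Python sketch resolves the ref of the <mtc:match> above and lists the text containers found under the referenced element. It is only a sketch: the helper names resolve_ref and candidate_spans are hypothetical, and it assumes the simple same-unit "#id" form of reference rather than the full XLIFF fragment identifier syntax.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"
MTC_NS = "urn:oasis:names:tc:xliff:matches:2.0"

# The <unit> from the example above, trimmed to what the sketch needs.
UNIT_XML = """<unit xmlns="urn:oasis:names:tc:xliff:document:2.0"
      xmlns:mtc="urn:oasis:names:tc:xliff:matches:2.0" id="u1">
  <mtc:matches>
    <mtc:match id="tc1" ref="#s1" similarity="65">
      <source>Please check the output.</source>
      <target>Veuillez vérifier la sortie.</target>
    </mtc:match>
  </mtc:matches>
  <segment id="s1">
    <source>Please check the results.</source>
    <target>Veuillez vérifier les résultats.</target>
  </segment>
</unit>"""


def resolve_ref(unit, ref):
    """Hypothetical helper: find the element in `unit` whose id matches '#id'.

    Assumption: only the short same-unit form is handled here.
    """
    target_id = ref.lstrip("#")
    for elem in unit.iter():
        if elem.get("id") == target_id:
            return elem
    return None


def candidate_spans(elem):
    """Hypothetical helper: list (name, text) for each source/target at or under `elem`."""
    wanted = ("{%s}source" % XLIFF_NS, "{%s}target" % XLIFF_NS)
    return [
        (e.tag.split("}")[1], "".join(e.itertext()))
        for e in elem.iter()
        if e.tag in wanted
    ]


unit = ET.fromstring(UNIT_XML)
match = unit.find("{%s}matches/{%s}match" % (MTC_NS, MTC_NS))
referenced = resolve_ref(unit, match.get("ref"))
spans = candidate_spans(referenced)

# The reference lands on a container holding two different text spans.
print(referenced.tag.split("}")[1], [name for name, _ in spans])
```

Because the referenced <segment> contains both a <source> and a <target>, the reference alone does not say which of the two spans the similarity of 65 applies to.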
I think a typical translation service application using XLIFF would do a translation memory lookup for each <segment>. If I were the author of the spec, and if the <segment> element could be used as the target span of text, I would not have included the example found in 5.1.4:
<unit id="1">
  <mtc:matches>
    <mtc:match ref="#m1">
      <source>He is my friend.</source>
      <target>Il est mon ami.</target>
    </mtc:match>
    <mtc:match ref="#m1">
      <source>He is my best friend.</source>
      <target>Il est mon meilleur ami.</target>
    </mtc:match>
  </mtc:matches>
  <segment>
    <source><mrk id="m1" type="mtc:match">He is my friend.</mrk></source>
  </segment>
  <segment>
    <source>Yet, I barely see him.</source>
  </segment>
</unit>
Using <mrk> allows us to annotate a substring of the source/target value. If a segment ID could be used for the reference, the segment ID would look like the more natural choice here. But this example uses <mrk> to enclose the entire span of text in the <source> element.
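A small Python sketch can show what a <mrk> reference resolves to. Note the assumptions: the unit below is a hypothetical variation of the 5.1.4 example in which both sentences sit in one <source> and the <mrk> encloses only the first, and only the simple "#id" form of reference is handled.

```python
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:2.0"

# Hypothetical variation of the spec example: <mrk> marks a substring of <source>.
UNIT_XML = """<unit xmlns="urn:oasis:names:tc:xliff:document:2.0"
      xmlns:mtc="urn:oasis:names:tc:xliff:matches:2.0" id="1">
  <mtc:matches>
    <mtc:match ref="#m1">
      <source>He is my friend.</source>
      <target>Il est mon ami.</target>
    </mtc:match>
  </mtc:matches>
  <segment>
    <source><mrk id="m1" type="mtc:match">He is my friend.</mrk> Yet, I barely see him.</source>
  </segment>
</unit>"""

unit = ET.fromstring(UNIT_XML)

# ElementTree has no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in unit.iter() for child in parent}

# Resolve "#m1" to the <mrk> element carrying that id.
mrk = next(e for e in unit.iter("{%s}mrk" % NS) if e.get("id") == "m1")

span_text = "".join(mrk.itertext())             # text enclosed by the marker only
container = parent_of[mrk].tag.split("}")[1]    # element that holds the marker
full_text = "".join(parent_of[mrk].itertext())  # whole text of that element

print(container, repr(span_text), repr(full_text))
```

Here the reference identifies exactly one span of text, and its parent shows that the span lives in <source>, so there is no question about which side the match refers to.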
I also looked carefully at the terms "span of text", "spanning", etc. in the specification. I feel these terms are used specifically for text content (including inline elements). The <segment> element is a container for the <source>/<target> elements, so I feel that a <segment> element does not specify a "span of text".
If the original intention was to allow segment ID references, then the spec should state: "points to a span of text or a container..."
-Yoshito