Subject: RE: [xliff] ref value in translation candidates module element <mtc:match>
-------- Original Message --------
Subject: RE: [xliff] ref value in translation candidates module element
From: "Yoshito Umaoka" <email@example.com>
Date: Tue, April 06, 2021 4:07 pm
To: "Mr. Rodolfo Raya" <firstname.lastname@example.org>
Thanks Rodolfo for your quick response.
>The specs says that "ref" attribute points to a span of text, that does not mean it must be a <mrk> element, I take it just means that it has to point to an element that contains the text that is being matched.
This is roughly what I thought initially, and I thought it made sense to point to the <segment> element. However, once I realized that the original intent did not rule out a matched text span being in the <target> element, I started to think the spec really means the exact span of text specified.
For example:
<source>Please check the output.</source>
<target>Veuillez vérifier la sortie.</target>
<source>Please check the results.</source>
<target>Veuillez vérifier les résultats.</target>
If we allow a <segment> to be referenced from <mtc:match> in a bilingual XLIFF file, it becomes ambiguous whether the match/similarity comes from the <source> value or the <target> value.
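To illustrate the ambiguity, here is a minimal Python sketch. It is only an illustration under stated assumptions: the unit below is a made-up bilingual fragment, XML namespaces are omitted, the ref is assumed to use the simple "#id" form, and `resolve` is a hypothetical helper, not anything defined by the spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical bilingual unit; namespaces are omitted to keep the sketch short.
UNIT = """
<unit id="u1">
  <segment id="s1">
    <source>Please check the <mrk id="m1" type="mtc:match">results</mrk>.</source>
    <target>Veuillez vérifier les résultats.</target>
  </segment>
</unit>
"""


def resolve(unit, ref):
    """Return the text strings a ref such as '#m1' could denote (hypothetical helper)."""
    wanted = ref.lstrip("#")
    for elem in unit.iter():
        if elem.get("id") != wanted:
            continue
        if elem.tag == "segment":
            # A segment is a container: it yields one string per child element.
            return ["".join(child.itertext()) for child in elem]
        # An inline marker yields exactly one span of text.
        return ["".join(elem.itertext())]
    return []


unit = ET.fromstring(UNIT)
mrk_spans = resolve(unit, "#m1")      # one candidate span
segment_spans = resolve(unit, "#s1")  # two candidates: source text and target text
```

With a <mrk> reference there is exactly one candidate string. With a segment reference there are two, and nothing in the ref says which one the match applies to.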
I think a typical translation service application using XLIFF would do a translation memory lookup for each <segment>. If I were the author of the spec, and a segment element could be used as the referenced span of text, I would not have included the example found in 5.1.4:
<source>He is my friend.</source>
<target>Il est mon ami.</target>
<source>He is my best friend.</source>
<target>Il est mon meilleur ami.</target>
<source><mrk id="m1" type="mtc:match">He is my friend.</mrk></source>
<source>Yet, I barely see him.</source>
Use of <mrk> allows us to annotate a substring of a source/target value. If a segment ID could be used for the reference, using the segment ID would be the more natural choice. But this example uses <mrk> to enclose the entire span of text in the <source> element.
I also looked carefully at the terms "span of text", "spanning", etc. in the specification. I feel these terms are used specifically for text values (including inline elements). The segment element is a container of source/target elements, so I feel a segment element does not specify a "span of text".
If the original intention was to allow a segment ID reference, then the spec should state: "points to a span of text or a container..."