xliff message

Subject: RE: [xliff] ref value in translation candidates module element <mtc:match>



Hi Yoshito,

I would differentiate two cases:

1) The match applies to the whole segment

2) The match applies to a fragment

If the match applies to the whole segment, pointing "ref" to the "id" of the segment makes sense. Keep in mind that both <segment> and <mtc:match> are containers. The concept of "span" doesn't make much sense here. As you say, we need to be more specific regarding the use of containers. The text is weak here.

If the match applies to a fragment, then pointing to the "id" of an inline element of any kind makes sense. It could be a <mrk> but it could also be something else. What is not clear enough is the location of the text span. Right now it could be anywhere inside the <unit>, so it might be in <target>,  <ignorable> or even a <gls:glossEntry>!!! 

This certainly needs to be clarified. 
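
To make the two cases concrete, here is a minimal sketch, not anything taken from the specification. It assumes Python's standard xml.etree.ElementTree, handles only the bare "#id" form of a reference (resolved inside the enclosing <unit>), and uses a hypothetical helper named classify_ref plus a made-up sample unit. It reports whether a "ref" lands on a container (<segment>) or on an inline element, and under which parent that inline element sits.

```python
import xml.etree.ElementTree as ET

MTC = "urn:oasis:names:tc:xliff:matches:2.0"

# Made-up sample: one match points at a segment, the other at a <mrk>.
UNIT = """
<unit xmlns="urn:oasis:names:tc:xliff:document:2.0"
      xmlns:mtc="urn:oasis:names:tc:xliff:matches:2.0" id="u1">
  <mtc:matches>
    <mtc:match ref="#s1"><source>A</source><target>B</target></mtc:match>
    <mtc:match ref="#m1"><source>C</source><target>D</target></mtc:match>
  </mtc:matches>
  <segment id="s1">
    <source>Please <mrk id="m1" type="mtc:match">check</mrk> the results.</source>
  </segment>
</unit>
"""


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def classify_ref(unit, ref):
    """Hypothetical helper: classify what a bare '#id' reference points to.

    Returns ("container", "segment") for a segment, ("inline", "<holder>/<side>")
    for an element nested in <source> or <target>, or None if nothing matches.
    """
    target_id = ref.lstrip("#")  # assumption: only the simple '#id' form is used
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in unit.iter() for child in parent}
    for el in unit.iter():
        if el.get("id") != target_id:
            continue
        if local(el.tag) == "segment":
            return ("container", "segment")
        node = el
        while node in parents:
            node = parents[node]
            if local(node.tag) in ("source", "target"):
                holder = parents[node]
                return ("inline", f"{local(holder.tag)}/{local(node.tag)}")
        return ("inline", "unknown")
    return None


unit = ET.fromstring(UNIT)
for match in unit.iter(f"{{{MTC}}}match"):
    ref = match.get("ref")
    print(ref, classify_ref(unit, ref))
```

A tool could use a check like this to reject or flag references that land somewhere unexpected, for example inside <ignorable>, once the specification says which locations are valid.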

Regards,
Rodolfo
--
Rodolfo M. Raya <rmraya@maxprograms.com>
Maxprograms http://www.maxprograms.com

-------- Original Message --------
Subject: RE: [xliff] ref value in translation candidates module element
<mtc:match>
From: "Yoshito Umaoka" <yoshito_umaoka@us.ibm.com>
Date: Tue, April 06, 2021 4:07 pm
To: "Mr. Rodolfo Raya" <rmraya@maxprograms.com>
Cc: xliff@lists.oasis-open.org

Thanks Rodolfo for your quick response.

>The spec says that the "ref" attribute points to a span of text. That does not mean it must be a <mrk> element; I take it to mean just that it has to point to an element that contains the text that is being matched.

This is roughly what I initially thought, and I thought it made sense to point to the <segment> element. However, once I realized that the original intent was not to rule out a matched text span located in the <target> element, I started thinking the spec really meant the exact text span specified.

For example:

<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en"
    xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0" xml:space="preserve"
    xmlns:mtc="urn:oasis:names:tc:xliff:matches:2.0" trgLang="fr">
  <file id="f1">
    <unit id="u1">
      <mtc:matches>
        <mtc:match id="tc1" ref="#s1" similarity="65">
          <source>Please check the output.</source>
          <target>Veuillez vérifier la sortie.</target>
        </mtc:match>
      </mtc:matches>
      <segment id="s1">
        <source>Please check the results.</source>
        <target>Veuillez vérifier les résultats.</target>
      </segment>
    </unit>
  </file>
</xliff>

If we allow a <segment> to be referenced from <mtc:match> in a bilingual XLIFF document, it becomes ambiguous whether the match/similarity comes from the <source> value or the <target> value.

I think a typical translation service application using XLIFF would do a translation memory lookup for each <segment>. If I were the author of the spec, and the segment element could be used as the target span of text, I would not have included the example found in 5.1.4:

<unit id="1">
  <mtc:matches>
    <mtc:match ref="#m1">
      <source>He is my friend.</source>
      <target>Il est mon ami.</target>
    </mtc:match>
    <mtc:match ref="#m1">
      <source>He is my best friend.</source>
      <target>Il est mon meilleur ami.</target>
    </mtc:match>
  </mtc:matches>
  <segment>
    <source><mrk id="m1" type="mtc:match">He is my friend.</mrk></source>
  </segment>
  <segment>
    <source>Yet, I barely see him.</source>
  </segment>
</unit>

Using <mrk> allows us to annotate a substring of the source/target value. If a segment ID could be used for the reference, the segment ID would be the more natural choice here. But this example uses <mrk> to enclose the entire span of text in the <source> element.

I also looked carefully at the terms "span of text", "spanning", etc. in the specification. I feel these terms are used specifically for text values (including inline elements). The segment element is a container of source/target elements, so I feel it does not specify a "span of text".

If the original intention was to allow a segment ID reference, then the spec should state "points to a span of text or a container..."

-Yoshito





