OASIS Mailing List Archives

xliff-seg message



Subject: Applying <mrk> on examples


Hi everyone,

Here is my homework: trying to apply the <mrk> element to
segment the text of the example file.

Cheers,
-yves

Title: Applying segmentation using <mrk>


Notes:

If paired tags such as <bpt>/<ept> or <bx>/<ex> need to be adjusted, the adjustment should be done without knowledge of the enclosed code.

It may not be necessary to adjust paired tags broken by segmentation, as long as leaving them alone does not result in invalid XML. For example, other translation tools do not try to adjust an inline <b>...</b> element that is broken by interactive segmentation.

 


TagExample1

<trans-unit id="TagExample1">
<!--Sloppy HTML starting with <B>bold. Fancy <I>bold italic text</B> in the middle. Only italic</I> last.-->
<!-- Version 1 (bpt and ept): how are matching <bpt> and <ept> handled if they need to go into different segments? -->
<source>Sloppy HTML starting with <bpt id="pt1" rid="pt1">&lt;B&gt;</bpt>bold. Fancy <bpt id="pt2" rid="pt2">&lt;I&gt;</bpt>bold italic text<ept id="pt1" rid="pt1">&lt;/B&gt;</ept> in the middle. Only italic<ept id="pt2" rid="pt2">&lt;/I&gt;</ept> last.</source>
</trans-unit>

<source><mrk mtype='x-seg' mid='1'>Sloppy HTML starting with <bpt id="pt1">&lt;B&gt;</bpt>bold.</mrk> 
<mrk mtype='x-seg' mid='2'>Fancy <bpt id="pt2">&lt;I&gt;</bpt>bold italic text<ept id="pt1">&lt;/B&gt;</ept> in the middle.</mrk> 
<mrk mtype='x-seg' mid='3'>Only italic<ept id="pt2">&lt;/I&gt;</ept> last.</mrk></source>
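A segmented <source> like the one above can be sanity-checked mechanically. The following is only a minimal sketch, assuming Python's standard xml.etree.ElementTree and no XLIFF namespace on the fragment; the helper names segment_ids and duplicate_ids are hypothetical and not part of any XLIFF tool.

```python
import xml.etree.ElementTree as ET

# Segmented <source> from TagExample1, copied from the message.
SEGMENTED = """<source><mrk mtype='x-seg' mid='1'>Sloppy HTML starting with <bpt id="pt1">&lt;B&gt;</bpt>bold.</mrk>
<mrk mtype='x-seg' mid='2'>Fancy <bpt id="pt2">&lt;I&gt;</bpt>bold italic text<ept id="pt1">&lt;/B&gt;</ept> in the middle.</mrk>
<mrk mtype='x-seg' mid='3'>Only italic<ept id="pt2">&lt;/I&gt;</ept> last.</mrk></source>"""


def segment_ids(source_xml):
    """Return the mid values of all x-seg <mrk> elements, in document order.

    ET.fromstring raises xml.etree.ElementTree.ParseError if the fragment
    is not well-formed XML (for example, a missing quote or </mrk>).
    """
    root = ET.fromstring(source_xml)
    return [m.get("mid") for m in root.iter("mrk") if m.get("mtype") == "x-seg"]


def duplicate_ids(mids):
    """Return the mid values that occur more than once (hypothetical helper)."""
    seen, dups = set(), []
    for mid in mids:
        if mid in seen and mid not in dups:
            dups.append(mid)
        seen.add(mid)
    return dups


mids = segment_ids(SEGMENTED)
print(mids, duplicate_ids(mids))
```

Note that the broken <bpt>/<ept> pairs do not make the fragment fail to parse, which is consistent with the note that adjusting them may not be necessary.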

 


TagExample2

<trans-unit id="TagExample2">
<!--Sloppy HTML starting with <B>bold. Fancy <I>bold italic text</B> in the middle. Only italic</I> last.-->
<!-- Version 2 (bx and ex): how are matching <bx> and <ex> handled if they need to go into different segments? -->
<source>Sloppy HTML starting with <bx id="pt1" rid="pt1"/>bold. Fancy <bx id="pt2" rid="pt2"/>bold italic text<ex id="pt1" rid="pt1"/> in the middle. Only italic<ex id="pt2" rid="pt2"/> last.</source>
</trans-unit>

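Codes shown in green in the original message (the <ex/> added at the end of a segment and the matching <bx/> added at the start of the next one) are added to get paired codes in each segment (I don't think we HAVE to do this).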

<source><mrk mtype='x-seg' mid='1'>Sloppy HTML starting with <bx id="pt1"/>bold.<ex id="pt1"/></mrk> 
<mrk mtype='x-seg' mid='2'><bx id="pt1"/>Fancy <bx id="pt2"/>bold italic text<ex id="pt1"/> in the middle.<ex id="pt2"/></mrk> 
<mrk mtype='x-seg' mid='3'><bx id="pt2"/>Only italic<ex id="pt2"/> last.</mrk></source>
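The adjustment shown above can be done purely from the code ids, without looking at what the codes enclose. The following is a minimal sketch under stated assumptions: each segment is modelled as a list of text strings and ("bx", id) / ("ex", id) tuples, and the function names balance_segments and render are hypothetical. It is one possible way to get the result above, not a prescribed algorithm.

```python
def balance_segments(segments):
    """Close codes left open at the end of each segment and reopen them
    at the start of the next one, using only the code ids."""
    open_ids = []   # ids opened but not yet closed, in opening order
    balanced = []
    for seg in segments:
        # Reopen every code that was still open when the previous segment ended.
        out = [("bx", cid) for cid in open_ids]
        for tok in seg:
            if isinstance(tok, tuple):
                kind, cid = tok
                if kind == "bx":
                    open_ids.append(cid)
                elif kind == "ex" and cid in open_ids:
                    open_ids.remove(cid)
            out.append(tok)
        # Close whatever is still open at the end of this segment.
        out.extend(("ex", cid) for cid in reversed(open_ids))
        balanced.append(out)
    return balanced


def render(seg):
    """Serialize one token list back to XLIFF-like inline markup."""
    return "".join(
        tok if isinstance(tok, str) else '<%s id="%s"/>' % tok for tok in seg
    )


# TagExample2, split at the sentence boundaries before any adjustment.
SEGMENTS = [
    ["Sloppy HTML starting with ", ("bx", "pt1"), "bold."],
    ["Fancy ", ("bx", "pt2"), "bold italic text", ("ex", "pt1"), " in the middle."],
    ["Only italic", ("ex", "pt2"), " last."],
]

for seg in balance_segments(SEGMENTS):
    print(render(seg))
```

When more than one code is open at a segment boundary, the order in which the added codes are written is a design choice; this sketch reopens them in opening order and closes them in reverse.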

 


TagExample3

<trans-unit id="TagExample3">
<!--Sloppy HTML starting with <B>bold. Fancy <I>bold italic text</B> in the middle. Only italic</I> last.-->
<!-- Version 3 (balanced g): how are <g> elements handled if a segment spans either the start or the end tags, but not both? -->
<source>Sloppy HTML starting with <g id="g1">bold. Fancy </g><g id="g2">bold italic text</g><g id="g3"> in the middle. Only italic</g> last.</source>
</trans-unit>

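Codes shown in green in the original message (the </g> added at the end of a segment and the matching <g> added at the start of the next one) are added to get paired codes in each segment. This is required with <g>.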

<source><mrk mtype='x-seg' mid='1'>Sloppy HTML starting with <g id="g1">bold.</g></mrk> 
<mrk mtype='x-seg' mid='2'><g id="g1">Fancy </g><g id="g2">bold italic text</g><g id="g3"> in the middle.</g></mrk> 
<mrk mtype='x-seg' mid='3'><g id="g3">Only italic</g> last.</mrk></source>

A thought: maybe we would need an attribute in <g> (and <bpt>/<ept>, etc.) to indicate that one of the tags of the element has been added during the segmentation? Not sure: just thinking aloud.

 


AltTransExample1

<trans-unit id="AltTransExample1">
<!-- how are <alt-trans> elements for the entire <trans-unit> handled if the <trans-unit> content is split into multiple segments? -->
<source>This paragraph has two sentences. It illustrates alt-trans handling.</source>
<alt-trans match-quality="90%">
<source>This paragraph has two sentences. It almost illustrates alt-trans handling.</source>
<target>Det här stycket har två meningar. Det visar nästan hur alt-trans ska skötas.</target>
</alt-trans>
</trans-unit>

The <mrk> elements can be added in the <alt-trans> too if needed.

<source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk> 
<mrk mtype='x-seg' mid='2'>It illustrates alt-trans handling.</mrk></source>
<alt-trans match-quality="90%">
<source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk> 
<mrk mtype='x-seg' mid='2'>It almost illustrates alt-trans handling.</mrk></source>
<target><mrk mtype='x-seg' mid='1'>Det här stycket har två meningar.</mrk> 
<mrk mtype='x-seg' mid='2'>Det visar nästan hur alt-trans ska skötas.</mrk></target>
</alt-trans>

But most likely, I think we could also have no segments at all in the <alt-trans> if the text comes from some source other than a TM (such as the result of leveraging).

 


AltTransExample2

<trans-unit id="AltTransExample2">
<!-- can single or multiple <alt-trans> elements be used to match single segments inside the <trans-unit>, and if so can we show which part they match? -->
<source>This paragraph has two sentences. It illustrates alt-trans handling.</source>
<alt-trans match-quality="100% for first sentence">
<source>This paragraph has two sentences.</source>
<target>Det här stycket har två meningar.</target>
</alt-trans>
<alt-trans match-quality="85% for second sentence">
<source>It almost illustrates alt-trans handling.</source>
<target>Det visar nästan hur alt-trans ska skötas.</target>
</alt-trans>
</trans-unit>

 

<source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk> 
<mrk mtype='x-seg' mid='2'>It illustrates alt-trans handling.</mrk></source>
<alt-trans match-quality="100% for first sentence">
<source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk></source>
<target><mrk mtype='x-seg' mid='1'>Det här stycket har två meningar.</mrk></target>
</alt-trans>
<alt-trans match-quality="85% for second sentence">
<source><mrk mtype='x-seg' mid='2'>It almost illustrates alt-trans handling.</mrk></source>
<target><mrk mtype='x-seg' mid='2'>Det visar nästan hur alt-trans ska skötas.</mrk></target>
</alt-trans>

The same notes as for AltTransExample1 apply here: I'm not sure we want segment markers in the <alt-trans> if they are suggestions. Yes if they record the history of the translation (after an edit, for example), but for TM matches they do not seem necessary.
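If the markers are kept in the <alt-trans>, a tool can use the shared mid value to show which segment each match belongs to. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree and assuming the segmented content of AltTransExample2 is wrapped back into its <trans-unit> (the message shows only the inner elements); the function name matches_by_segment is hypothetical.

```python
import xml.etree.ElementTree as ET

# Segmented AltTransExample2; the enclosing <trans-unit> is an assumption.
UNIT = """<trans-unit id="AltTransExample2">
<source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk>
<mrk mtype='x-seg' mid='2'>It illustrates alt-trans handling.</mrk></source>
<alt-trans match-quality="100% for first sentence">
<source><mrk mtype='x-seg' mid='1'>This paragraph has two sentences.</mrk></source>
<target><mrk mtype='x-seg' mid='1'>Det här stycket har två meningar.</mrk></target>
</alt-trans>
<alt-trans match-quality="85% for second sentence">
<source><mrk mtype='x-seg' mid='2'>It almost illustrates alt-trans handling.</mrk></source>
<target><mrk mtype='x-seg' mid='2'>Det visar nästan hur alt-trans ska skötas.</mrk></target>
</alt-trans>
</trans-unit>"""


def matches_by_segment(unit_xml):
    """Map each segment mid in the main <source> to a list of
    (match-quality, target text) pairs taken from the <alt-trans> elements."""
    unit = ET.fromstring(unit_xml)
    # Start with one empty list per segment of the main <source>.
    result = {m.get("mid"): [] for m in unit.find("source").iter("mrk")}
    for alt in unit.findall("alt-trans"):
        target = alt.find("target")
        if target is None:
            continue
        for m in target.iter("mrk"):
            text = "".join(m.itertext())
            result.setdefault(m.get("mid"), []).append(
                (alt.get("match-quality"), text)
            )
    return result


for mid, matches in matches_by_segment(UNIT).items():
    print(mid, matches)
```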

 

-end-

 


