
Subject: RE: [xliff] XLIFF 2.0 example files for segmentation


Hi David,
 
Your example has a terrible problem. According to it, <source> can be a child of both <unit> and <segment>. This means that <source> and <target> would move back and forth between <unit> and <segment>, and we could end up with a duplicated <source> (one copy directly inside <unit> and another inside a <segment>).
I think it is a very bad idea.
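
To make the ambiguity concrete, here is a minimal sketch in Python of a check that flags any <unit> holding a <source> both directly and inside a <segment>. The function name is hypothetical, and the sketch assumes un-namespaced element names like those in the examples in this thread.

```python
import xml.etree.ElementTree as ET

def units_with_ambiguous_source(xliff_text):
    """Return ids of units that carry <source> at both levels.

    Assumption: elements are not namespaced, as in the draft examples.
    """
    root = ET.fromstring(xliff_text)
    flagged = []
    for unit in root.iter("unit"):
        direct = unit.find("source") is not None          # <unit><source>
        nested = unit.find("segment/source") is not None  # <unit><segment><source>
        if direct and nested:
            flagged.append(unit.get("id"))
    return flagged
```

If both placements were allowed, every tool reading a file would need a check like this before it could decide which <source> is authoritative.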
 
Regards,
Rodolfo
--
Rodolfo M. Raya <RMRAYA@MAXPROGRAMS.COM>
Maxprograms http://www.maxprograms.com
 
 
-------- Original Message --------
Subject: [xliff] XLIFF 2.0 example files for segmentation
From: David Walters <waltersd@us.ibm.com>
Date: Wed, November 09, 2011 5:52 pm
To: xliff@lists.oasis-open.org

It is easier for me to understand the situation if I have an example to reference.

Here is a simple Java PropertyResourceBundle file to be used as the original source file.


    string1=First sentence.\nSecond sentence.\nThird\nsentence.
    string2=The user <b>{0}</b> deleted file {1}.  File {1} cannot be recovered.


A product developer might create an extraction program to create this XLIFF 2.0 file.
    Note: I included a "segment" attribute to document how the <source> text was "segmented".

    Without <segment>.
      <?xml version="1.0" encoding="utf-8"?>
      <xliff version="2.0" segment="block">
        <file srclang="en" original="test.properties">
          <unit id="string1">
            <source>First sentence.
      Second sentence.
      Third
      sentence.</source>
          </unit>
          <unit id="string2">
            <source>The user <pc id="1"><ph id="2"/></pc> deleted file <ph id="3"/>.  File <ph id="3"/> cannot be recovered.</source>
          </unit>
        </file>
      </xliff>

    With <segment>.
      <?xml version="1.0" encoding="utf-8"?>
      <xliff version="2.0" segment="block">
        <file srclang="en" original="test.properties">
          <unit id="string1">
            <segment id="1">
              <source>First sentence.
        Second sentence.
        Third
        sentence.</source>
            </segment>
          </unit>
          <unit id="string2">
            <segment id="1">
              <source>The user <pc id="1"><ph id="2"/></pc> deleted file <ph id="3"/>.  File <ph id="3"/> cannot be recovered.</source>
            </segment>
          </unit>
        </file>
      </xliff>
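
As a rough illustration of what such an extraction program might do, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names are hypothetical, only the \n escape of the properties format is handled, and the only inline codes recognized are <b>...</b> and {n} placeholders. Ids are handed out in order of first appearance, and a repeated placeholder reuses its id, which matches the numbering in the examples above.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical inline-code pattern: <b>, </b>, and {n} placeholders.
TOKEN = re.compile(r"(<b>|</b>|\{\d+\})")

def parse_properties(text):
    """Parse simple key=value lines; only the \\n escape is decoded."""
    entries = {}
    for line in text.splitlines():
        line = line.lstrip()
        if not line or line[0] in "#!":  # skip blanks and comments
            continue
        key, _, value = line.partition("=")
        entries[key.rstrip()] = value.lstrip().replace("\\n", "\n")
    return entries

def to_inline(value):
    """Replace inline codes with <pc>/<ph> and escape the remaining text."""
    ids, next_id, out = {}, 1, []
    for part in TOKEN.split(value):
        if part == "<b>":
            out.append(f'<pc id="{next_id}">')
            next_id += 1
        elif part == "</b>":
            out.append("</pc>")
        elif TOKEN.fullmatch(part):      # a {n} placeholder
            if part not in ids:          # a repeated placeholder keeps its id
                ids[part] = next_id
                next_id += 1
            out.append(f'<ph id="{ids[part]}"/>')
        else:
            out.append(escape(part))
    return "".join(out)

def to_unit(key, value, with_segment=False):
    """Build one <unit>, with or without a wrapping <segment>."""
    source = f"<source>{to_inline(value)}</source>"
    if with_segment:
        source = f'<segment id="1">{source}</segment>'
    return f"<unit id={quoteattr(key)}>{source}</unit>"
```

The with_segment flag switches between the two layouts shown above; the extracted text is the same in both cases.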


A translation tool may modify the file to segment the text based on sentences.  The translated file might be the following:

    <?xml version="1.0" encoding="utf-8"?>
    <xliff version="2.0" segment="sentence">
      <file srclang="en" tgtlang="es" original="test.properties">
        <unit id="string1">
          <segment id="1">
            <source>First sentence.</source>
            <target>Primera frase.</target>
          </segment>
          <ignorable id="2">
            <source>
    </source>
            <target>
    </target>
          </ignorable>
          <segment id="3">
            <source>Second sentence.</source>
            <target>La segunda frase.</target>
          </segment>
          <ignorable id="4">
            <source>
    </source>
            <target>
    </target>
          </ignorable>
          <segment id="5">
            <source>Third
    sentence.</source>
            <target>La tercera
    frase.</target>
          </segment>
        </unit>
        <unit id="string2">
          <segment id="1">
            <source>The user <pc id="1"><ph id="2"/></pc> deleted file <ph id="3"/>.</source>
            <target>El <pc id="1"><ph id="2"/></pc> usuario eliminado <ph id="3"/> archivo.</target>
          </segment>
          <ignorable id="2">
            <source>  </source>
            <target> </target>
          </ignorable>
          <segment id="3">
            <source>File <ph id="3"/> cannot be recovered.</source>
            <target><ph id="3"/> archivo no se puede recuperar.</target>
          </segment>
        </unit>
      </file>
    </xliff>

After translation, the product developer would expect to get back a translated file that maps to the version of the XLIFF file he sent out for translation.


    Without <segment>.
      <?xml version="1.0" encoding="utf-8"?>
      <xliff version="2.0" segment="block">
        <file srclang="en" tgtlang="es" original="test.properties">
          <unit id="string1">
            <source>First sentence.
      Second sentence.
      Third
      sentence.</source>
            <target>Primera frase.
      La segunda frase.
      La tercera
      frase.</target>
          </unit>
          <unit id="string2">
            <source>The user <pc id="1"><ph id="2"/></pc> deleted file <ph id="3"/>.  File <ph id="3"/> cannot be recovered.</source>
            <target>El <pc id="1"><ph id="2"/></pc> usuario eliminado <ph id="3"/> archivo. <ph id="3"/> archivo no se puede recuperar.</target>
          </unit>
        </file>
      </xliff>

    With <segment>.
      <?xml version="1.0" encoding="utf-8"?>
      <xliff version="2.0" segment="block">
        <file srclang="en" tgtlang="es" original="test.properties">
          <unit id="string1">
            <segment id="1">
              <source>First sentence.
      Second sentence.
      Third
      sentence.</source>
              <target>Primera frase.
      La segunda frase.
      La tercera
      frase.</target>
            </segment>
          </unit>
          <unit id="string2">
            <segment id="1">
              <source>The user <pc id="1"><ph id="2"/></pc> deleted file <ph id="3"/>.  File <ph id="3"/> cannot be recovered.</source>
              <target>El <pc id="1"><ph id="2"/></pc> usuario eliminado <ph id="3"/> archivo. <ph id="3"/> archivo no se puede recuperar.</target>
            </segment>
          </unit>
        </file>
      </xliff>
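
One way to get from the sentence-segmented file back to these block-level files is to concatenate the <source> and <target> content of each <segment> and <ignorable> child in document order. The following Python sketch shows the idea; the function names are hypothetical, and it assumes un-namespaced elements and that every translatable part of a <unit> is a direct child.

```python
import xml.etree.ElementTree as ET

def inner_xml(elem):
    """Serialize an element's mixed content (text plus inline codes)."""
    if elem is None:
        return ""
    # ET.tostring on a child includes the text that follows it (its tail).
    return (elem.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in elem
    )

def merge_unit(unit):
    """Join <segment>/<ignorable> children of a <unit> in document order."""
    source = "".join(inner_xml(part.find("source")) for part in unit)
    target = "".join(inner_xml(part.find("target")) for part in unit)
    return source, target
```

Because the <ignorable> parts are included in the concatenation, the whitespace between sentences survives the round trip, including any change the translation tool made to it on the target side.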

Are these realistic examples?

David

Corporate Globalization Tool Development
EMail:  waltersd@us.ibm.com          
Phone: (507) 253-7278,   T/L:553-7278,   Fax: (507) 253-1721

CHKPII:                    http://w3-03.ibm.com/globalization/page/2011
TM file formats:     http://w3-03.ibm.com/globalization/page/2083
TM markups:         http://w3-03.ibm.com/globalization/page/2071

