OASIS Mailing List Archives

 



xliff message



Subject: embedding XSLT elements in XLIFF source and target elements


 

Hi All

 

About a year ago, I raised some issues regarding the translation of a peculiar file format that Oracle intends to be the first step in localizing a voice-enabled application.

 

This issue has recently resurfaced within the TC, as it bears on the wider question of including XML structures from other namespaces directly within the XLIFF source and target elements.

 

I have put together the following document which should refresh your memories.

 

In summary, I am suggesting that XSLT, or the relevant elements of XSLT, be an allowed namespace within source and target, as it allows some very intricate processing that gets round translatability issues that have never really been addressed within I18n.

 

I have used some very rough Java code as examples, but the issues apply to any other file type.

 

Mat

 

------------------------------------------------------------

 

Translation quality and embedding XSLT within XLIFF

 

Consider a simple Java method designed to report the number of new e-mails received by a mail server application:

public void printNewmails(int count) {
    System.out.println("You have " + count + " new E-mail message(s)");
}

 

There are several problems here.

 

Firstly, the message uses a shorthand mechanism, "message(s)", for handling plurals.

 

An improved mechanism would be:

 

public void printNewmails(int count) {
    if (count == 1) {
        System.out.println("You have " + count + " new E-mail message");
    } else {
        System.out.println("You have " + count + " new E-mail messages");
    }
}

 

This can be more easily localized:

public void printNewmails(int count) {
    String message;
    if (count == 1) {
        message = LoadResource(SINGLE_MESSAGE);
    } else {
        message = LoadResource(MULTIPLE_MESSAGES);
    }
    // Substitute the count into the localized pattern
    // (assumes the resource contains a placeholder such as %d)
    message = String.format(message, count);
    System.out.println(message);
}

 

But there is now a more subtle issue. Appending an ‘s’ to ‘message’ for “no messages” or for more than one message is a convention that applies in English. It is most likely not applicable in other languages.

 

For example, Irish appends a plural marker for all non-zero values.

So the logic is different in Irish, i.e.

if (count == 0) {
    // resource for zero
} else {
    // resource for all non-zero values
}

 

Other languages may use different logic or a different number of formats.

 

For example, Polish uses five different resources depending on the count

 

 

An even more extreme example occurs if the system can be used to print the number of new e-mails, voice mails, or faxes:

 

 

public static final int E_MAIL = 1;
public static final int VOICE_MAIL = 2;
public static final int FAX = 3;

 

public void printNewmails(int count, int type) {
    String message;
    String messageType = null;

    // Pick the message pattern by count
    if (count == 1) {
        message = LoadResource(SINGLE_MESSAGE);
    } else {
        message = LoadResource(MULTIPLE_MESSAGES);
    }

    // Pick the localized name of the message type
    if (type == E_MAIL) {
        messageType = LoadResource(E_MAIL_MESSAGE);
    } else if (type == VOICE_MAIL) {
        messageType = LoadResource(VOICE_MAIL_MESSAGE);
    } else if (type == FAX) {
        messageType = LoadResource(FAX_MESSAGE);
    }

    // Substitute the count and the message type into the pattern
    message = String.format(message, count, messageType);
    System.out.println(message);
}

 

 

The complexity here is that some languages may change the initial message depending on the gender of the message type.

This then requires even more complex code, with the code knowing the gender of each resource.

But how does a developer know about, or code for, all the possible gender and linguistic variations? The localizer should make this decision.

 

public void printNewmails(int count, int type, Locale loc) {
    LocalisedResource messageType = LoadLocalisedResource(type, loc);
    LocalisedResource message = LoadLocalisedResource(count, messageType, loc);

    System.out.println(message.toString());
}

 

In this example, the first call to LoadLocalisedResource returns a structure containing the localized text of the resource for the required locale. The structure also carries metadata containing the gender of the resource.

 

The second call uses the gender held in the metadata of the message type to select which language-specific resource is returned.
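A minimal sketch of that two-step lookup, to make the idea concrete. Everything here is an assumption for illustration: the class GenderDemo, the Gender enum, the in-memory maps standing in for resource files, and the French strings. It only covers the single-message case.

```java
import java.util.HashMap;
import java.util.Map;

public class GenderDemo {
    enum Gender { MASCULINE, FEMININE }

    /** Localized text plus the linguistic metadata supplied by the translator. */
    static final class LocalisedResource {
        final String text;
        final Gender gender;

        LocalisedResource(String text, Gender gender) {
            this.text = text;
            this.gender = gender;
        }

        @Override
        public String toString() {
            return text;
        }
    }

    // Hypothetical French resources; a real system would load these per locale.
    static final Map<String, LocalisedResource> TYPES = new HashMap<>();
    static final Map<Gender, String> SINGLE_MESSAGE = new HashMap<>();

    static {
        TYPES.put("E_MAIL", new LocalisedResource("courriel", Gender.MASCULINE));
        TYPES.put("FAX", new LocalisedResource("télécopie", Gender.FEMININE));
        SINGLE_MESSAGE.put(Gender.MASCULINE, "Vous avez un nouveau %s");
        SINGLE_MESSAGE.put(Gender.FEMININE, "Vous avez une nouvelle %s");
    }

    /** First look up the message type, then let its gender pick the sentence pattern. */
    static String singleMessage(String type) {
        LocalisedResource messageType = TYPES.get(type);
        String pattern = SINGLE_MESSAGE.get(messageType.gender);
        return String.format(pattern, messageType.text);
    }

    public static void main(String[] args) {
        System.out.println(singleMessage("E_MAIL"));
        System.out.println(singleMessage("FAX"));
    }
}
```

Note that the developer still had to know that gender is the attribute that matters for this language, which is the weakness discussed next.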

 

 

But this still requires the developer to understand all these linguistic issues, and to know the gender of each message type for all possible languages.

 

 

To be fully internationalized, the decision on which message to use needs to be localized, as well as the messages themselves.
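As a small illustration of moving the decision into the resource, Java's standard java.text.MessageFormat accepts a "choice" sub-format, so the translatable string itself carries the branching. The class name ChoiceMessageDemo and the resource text are assumptions for this sketch; only MessageFormat and its choice syntax are standard.

```java
import java.text.MessageFormat;
import java.util.Locale;

public class ChoiceMessageDemo {
    // Hypothetical resource string: the branching lives in the translatable
    // text, so the translator rather than the developer owns it.
    static final String EN_NEW_MESSAGES =
        "You have {0,choice,0#no new messages|1#one new message"
        + "|1<{0,number,integer} new messages}.";

    /** Formats the given pattern with the count as argument {0}. */
    static String format(String pattern, int count) {
        return new MessageFormat(pattern, Locale.ENGLISH)
            .format(new Object[] { count });
    }

    public static void main(String[] args) {
        for (int count : new int[] { 0, 1, 3 }) {
            System.out.println(format(EN_NEW_MESSAGES, count));
        }
    }
}
```

The choice sub-format only selects on numeric ranges. It cannot express rules based on the last digit of the count, nor rules that depend on the gender of another resource, so it covers only part of the problem described here.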

 

 

Traditionally, this issue has been handled in software by designing applications so that these issues do not occur.

For example, printNewmails(int count, int type) will output:

            New Email message count : 3

or

            New Fax message count : 0

 

But if the UI quality is to be of a higher standard, or is to be converted into a voice enabled application, this level of quality is insufficient.

 

For voice-enabled applications, further issues arise, in that the count values also need to be localized as “One”, “Two”, etc.

 

As stated above, the way to achieve the highest possible quality in localizing this type of application is to localize the decision logic as well as the resources.

 

One way to do this would be to introduce the linguistic metadata and logic into the translatable XLIFF structures themselves.

Firstly, when translating “E_MAIL”, translators can add gender meta tags to the resource.

Secondly, when translating the actual messages, the if (count == ...) logic can be implemented using XSLT choose structures.

 

The choose structures can pick up input values, such as the count, or meta tags from other structures.
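As a rough sketch of a choose structure picking up an input value, the following uses standard XSLT 1.0 element names (xsl:param, xsl:choose, xsl:when, xsl:otherwise) and runs them through the XSLT processor that ships with the JDK. The class name XsltChooseDemo, the parameter name numVM and the wrapping stylesheet are assumptions for illustration, not a proposal for the exact XLIFF syntax.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltChooseDemo {
    // The count arrives as a stylesheet parameter; the choose block is the
    // part a translator would localize.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:param name='numVM'/>"
        + "<xsl:template match='/'>"
        + "<xsl:choose>"
        + "<xsl:when test='$numVM = 0'>You have no new messages.</xsl:when>"
        + "<xsl:when test='$numVM = 1'>You have one new message.</xsl:when>"
        + "<xsl:otherwise>You have <xsl:value-of select='$numVM'/>"
        + " new messages.</xsl:otherwise>"
        + "</xsl:choose>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Evaluates the choose block for the given count and returns the text. */
    static String render(int numVM) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            // Passed as a string; XPath converts it to a number for the tests.
            transformer.setParameter("numVM", Integer.toString(numVM));
            StringWriter out = new StringWriter();
            // The input document is a dummy; only the parameter matters here.
            transformer.transform(
                new StreamSource(new StringReader("<empty/>")),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        for (int count : new int[] { 0, 1, 3 }) {
            System.out.println(render(count));
        }
    }
}
```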

 

Consider the following XSLT-style structure:

<var name="numVM" type="integer"/>
<choose>
  <when test="numVM=0">
    You have no new messages.
  </when>
  <when test="numVM=1">
    You have one new message.
  </when>
  <otherwise>
    You have <varref name="numVM"/> new messages.
  </otherwise>
</choose>

 

And its Polish translation:

<var name="numVM" type="integer"/>
<choose>
  <when test="numVM=0">
    Nie masz nowych wiadomości
  </when>
  <when test="numVM=1">
    Masz jedną nową wiadomość
  </when>
  <when test="numVM&lt;5">
    Masz <varref name="numVM" variant="feminine_accusative"/> nowe wiadomości
  </when>
  <when test="numVM&lt;21">
    Masz <varref name="numVM" variant="feminine_accusative"/> nowych wiadomości
  </when>
  <when test="in(numVM % 10, [2,3,4])">
    Masz <varref name="numVM" variant="feminine_accusative"/> nowe wiadomości
  </when>
  <otherwise>
    Masz <varref name="numVM" variant="feminine_accusative"/> nowych wiadomości
  </otherwise>
</choose>
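For comparison, here is roughly what the Polish branching looks like when it has to live in application code instead. This is a sketch only: the class and method names are made up, and the count is printed as digits rather than spelled out as the varref variant would do. One thing goes beyond the branch list above and is flagged as my own addition: a check on count % 100, so that 112 to 114 take "nowych" in the same way as 12 to 14.

```java
public class PolishNewMessages {
    /** Returns the Polish "new messages" text for the given count. */
    static String newMessages(int count) {
        if (count == 0) {
            return "Nie masz nowych wiadomości";
        }
        if (count == 1) {
            return "Masz jedną nową wiadomość";
        }
        int lastDigit = count % 10;
        int lastTwoDigits = count % 100;
        // "nowe" for counts ending in 2, 3 or 4, except those ending in
        // 12, 13 or 14 (the % 100 guard is an addition, see the lead-in).
        boolean few = lastDigit >= 2 && lastDigit <= 4
            && (lastTwoDigits < 12 || lastTwoDigits > 14);
        return "Masz " + count + (few ? " nowe wiadomości" : " nowych wiadomości");
    }

    public static void main(String[] args) {
        for (int count : new int[] { 0, 1, 3, 5, 12, 22, 112 }) {
            System.out.println(newMessages(count));
        }
    }
}
```

Every language with its own rule would need another method like this, written by a developer, which is exactly the knowledge that is better left with the translator.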

Now consider how this could be transported in an XLIFF document.

The obvious solution is to include a trans-unit for each of the three English variants.

But this requires six trans-units in Polish.

 

The Polish translator could add the extra trans-units, with the <when> conditions included in the trans-unit ids.

 

This becomes awkward for reuse.

 

If the second English resource is modified in a later version, which of the six Polish trans-units can be reused, and which need retranslation?

 

It also requires that the XLIFF editing tool be capable of adding extra trans-units, and be allowed to do so.

A simpler variation may be to put the entire structure, as the translatable text, in a single XLIFF trans-unit.

The translator then localizes a single XSLT fragment for each resource, and the translated XLIFF file then becomes:

 

 

<xliff version='1.1'
 xmlns='urn:oasis:names:tc:xliff:document:1.1'>
 <file original='hello.txt' source-language='en' target-language='pl'
  datatype='plaintext'>
  <body>
   <trans-unit id='new_messages'>
    <source><var name="numVM" type="integer"/>
    <choose>
      <when test="numVM=0">
        You have no new messages.
      </when>
      <when test="numVM=1">
        You have one new message.
      </when>
      <otherwise>
        You have <varref name="numVM"/> new messages.
      </otherwise>
    </choose></source>
    <target><var name="numVM" type="integer"/>
    <choose>
      <when test="numVM=0">
        Nie masz nowych wiadomości
      </when>
      <when test="numVM=1">
        Masz jedną nową wiadomość
      </when>
      <when test="numVM&lt;5">
        Masz <varref name="numVM" variant="feminine_accusative"/> nowe wiadomości
      </when>
      <when test="numVM&lt;21">
        Masz <varref name="numVM" variant="feminine_accusative"/> nowych wiadomości
      </when>
      <when test="in(numVM % 10, [2,3,4])">
        Masz <varref name="numVM" variant="feminine_accusative"/> nowe wiadomości
      </when>
      <otherwise>
        Masz <varref name="numVM" variant="feminine_accusative"/> nowych wiadomości
      </otherwise>
    </choose></target>
   </trans-unit>
  </body>
 </file>
</xliff>


