Subject: embedding XSLT elements in XLIFF source and target elements
Hi All
About a year ago, I raised some issues regarding the translation of a peculiar file format that Oracle intends to be the first step in localizing a voice-enabled application.
This issue has recently resurfaced within the TC, as it affects the broader question of including XML structures from other namespaces directly within the XLIFF source and target elements.
I have put together the following document, which should refresh your memories.
In summary, I am suggesting that XSLT, or the relevant elements of XSLT, be an allowed namespace within source and target, as it allows some very intricate processing that gets round some translatability issues that have never really been addressed within I18n.
I have used some very rough Java code as examples, but the issues apply to any other file type.
Mat
------------------------------------------------------------
Consider a simple Java method designed to report the number of new emails received by a mail server application:
public void printNewmails(int count){
    System.out.println("You have " + count + " new E-mail message(s)");
}
There are several problems here.
Firstly, the message uses a short-hand mechanism of handling plurals.
An improved mechanism would be:
public void printNewmails(int count){
    if (count == 1){
        System.out.println("You have " + count + " new E-mail message");
    }
    else {
        System.out.println("You have " + count + " new E-mail messages");
    }
}
This can be more easily localized:
public void printNewmails(int count){
    String message = null;
    if (count == 1){
        message = LoadResource(SINGLE_MESSAGE);
    }
    else {
        message = LoadResource(MULTIPLE_MESSAGES);
    }
    // String.format returns a new string, so the result must be kept
    message = String.format(message, count);
    System.out.println(message);
}
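A minimal, self-contained sketch of the resource-based version above. The RESOURCES map is only a stand-in for the LoadResource call, and the key names and the use of java.text.MessageFormat are assumptions made for illustration, not part of the original code.

```java
import java.text.MessageFormat;
import java.util.Map;

class NewMailMessage {
    // Hypothetical stand-in for LoadResource: a real application would
    // read these patterns from a locale-specific resource file.
    static final Map<String, String> RESOURCES = Map.of(
        "SINGLE_MESSAGE", "You have {0} new E-mail message",
        "MULTIPLE_MESSAGES", "You have {0} new E-mail messages");

    // Applies the English singular/plural rule, then substitutes the count.
    static String newMailMessage(int count) {
        String key = (count == 1) ? "SINGLE_MESSAGE" : "MULTIPLE_MESSAGES";
        return MessageFormat.format(RESOURCES.get(key), count);
    }

    public static void main(String[] args) {
        System.out.println(newMailMessage(1));
        System.out.println(newMailMessage(3));
    }
}
```

Note that the choice between the two keys is still made in code, which is exactly the part that the rest of this document argues should be localizable.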
But there is now a more subtle issue. Appending an ‘s’ to ‘message’ for “no messages” or for more than one message is a convention that applies in English. It is most likely not applicable in other languages.
For example, Irish uses one form for zero and another for all non-zero values.
So the logic is different in Irish:
i.e. if (count == 0){
…
}
else{
…
}
Other languages may use different logic or a different number of formats
For example, Polish uses five different resources depending on the count
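To make the difference in logic concrete, here is a rough sketch of choosing a plural category per language. The class and method names are hypothetical, and the Polish rule follows the commonly documented integer plural categories (one, few, many); an actual message set may split these further, for example with a separate message for zero.

```java
class PluralRules {
    // English: the singular form is used only for exactly one item.
    static String english(int n) {
        return (n == 1) ? "one" : "other";
    }

    // Polish: 1 is singular; counts ending in 2-4 (but not 12-14) take one
    // plural form; everything else, including 0, takes another.
    static String polish(int n) {
        int mod10 = n % 10;
        int mod100 = n % 100;
        if (n == 1) {
            return "one";
        }
        if (mod10 >= 2 && mod10 <= 4 && (mod100 < 12 || mod100 > 14)) {
            return "few";
        }
        return "many";
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 3, 5, 12, 22}) {
            System.out.println(n + " en=" + english(n) + " pl=" + polish(n));
        }
    }
}
```

A developer writing only the English branch has no way of knowing that the Polish rule depends on the last digits of the count.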
An even more extreme example occurs if the system can be used to print the number of new e-mails, voice mails, or faxes:
public static final int E_MAIL = 1;
public static final int VOICE_MAIL = 2;
public static final int FAX = 3;
public void printNewmails(int count, int type){
    String message = null;
    String messageType = null;
    if (count == 1){
        message = LoadResource(SINGLE_MESSAGE);
    }
    else {
        message = LoadResource(MULTIPLE_MESSAGES);
    }
    if (type == E_MAIL){
        messageType = LoadResource(E_MAIL_MESSAGE);
    }
    else if (type == VOICE_MAIL){
        messageType = LoadResource(VOICE_MAIL_MESSAGE);
    }
    else if (type == FAX){
        messageType = LoadResource(FAX_MESSAGE);
    }
    // Substitute both the count and the message type into the pattern
    message = String.format(message, count, messageType);
    System.out.println(message);
}
The complexity here is that some languages may change the initial message depending on the gender of the message type.
This then requires even more complex code, with the code knowing the gender of each resource.
But how does a developer know, or code for, all the possible gender and linguistic variations? The localizer should make this decision.
public void printNewmails(int count, int type, Locale loc){
    LocalisedResource messageType = LoadLocalisedResource(type, loc);
    LocalisedResource message = LoadLocalisedResource(count, messageType, loc);
    System.out.println(message.toString());
}
In this example, the first call to LoadLocalisedResource returns a structure containing the localized text of the resource for the required locale. The structure also has meta data containing the gender of the resource
The second call uses the gender of the message type to alter the language-specific resource returned, depending on the metadata of the message type.
But this still requires the developer to understand all these linguistic issues, and to know the gender of each message Type for all possible languages
To be fully internationalized, the decision on which message to use needs to be localized, as well as the messages themselves
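As a rough illustration of a resource that carries gender metadata, the sketch below pairs each message type with a gender and selects the sentence pattern from it. The German strings, the Gender enum, and the map-based lookup are all assumptions made for this example; only the LocalisedResource name comes from the code above.

```java
import java.util.Map;

class GenderedResources {
    enum Gender { MASCULINE, FEMININE, NEUTER }

    // Localized text plus the grammatical gender recorded by the translator.
    static final class LocalisedResource {
        final String text;
        final Gender gender;

        LocalisedResource(String text, Gender gender) {
            this.text = text;
            this.gender = gender;
        }
    }

    // Hypothetical German data for the three message types.
    static final Map<String, LocalisedResource> TYPES_DE = Map.of(
        "E_MAIL", new LocalisedResource("E-Mail", Gender.FEMININE),
        "VOICE_MAIL", new LocalisedResource("Sprachnachricht", Gender.FEMININE),
        "FAX", new LocalisedResource("Fax", Gender.NEUTER));

    // Hypothetical German patterns for one new item, keyed by gender.
    static final Map<Gender, String> SINGLE_DE = Map.of(
        Gender.MASCULINE, "Sie haben einen neuen %s",
        Gender.FEMININE, "Sie haben eine neue %s",
        Gender.NEUTER, "Sie haben ein neues %s");

    // Picks the pattern that agrees with the gender of the message type.
    static String singleNewItem(String type) {
        LocalisedResource noun = TYPES_DE.get(type);
        return String.format(SINGLE_DE.get(noun.gender), noun.text);
    }

    public static void main(String[] args) {
        System.out.println(singleNewItem("E_MAIL"));
        System.out.println(singleNewItem("FAX"));
    }
}
```

Even in this form the set of genders and the lookup by gender are fixed in code, so a language that needs a different distinction would still require a code change.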
Traditionally, this issue has been handled in software by designing the applications so that these issues do not occur
For example, a call to printNewmails(int count, int type) will output:
New Email message count : 3
Or
New Fax message count : 0
But if the UI quality is to be of a higher standard, or the application is to be converted into a voice-enabled application, this level of quality is insufficient.
For voice-enabled applications, further issues arise, in that the count values also need to be localized as “One”, “Two” etc.
As stated above, achieving the highest possible quality in localizing this type of application requires localizing the decision logic as well as the resources.
One way to do this would be to introduce the linguistic metadata and logic into the translatable XLIFF structures themselves.
Firstly, when translating “E_MAIL”, translators can add gender meta tags to the resource.
Secondly, when translating the actual messages, the if (count == …) logic can be implemented using XSLT choose structures.
The choose structures can pick up input values, such as the count, or meta tags from other structures.
Consider the following XSLT structure:
<var name="numVM" type="integer"/>
<choose>
  <when test="numVM==0">
    You have no new messages.
  </when>
  <when test="numVM==1">
    You have one new message.
  </when>
  <otherwise>
    You have <varref name="numVM"/> new messages.
  </otherwise>
</choose>
And its Polish translation:
<var name="numVM" type="integer"/>
<choose>
  <when test="numVM==0">
    Nie masz nowych wiadomości
  </when>
  <when test="numVM==1">
    Masz jedną nową wiadomość
  </when>
  <when test="numVM&lt;5">
    Masz <varref name="numVM" variant="feminine_accusative"/> nowe wiadomości
  </when>
  <when test="numVM&lt;21">
    Masz <varref name="numVM" variant="feminine_accusative"/> nowych wiadomości
  </when>
  <when test="in(numVM % 10, [2,3,4])">
    Masz <varref name="numVM" variant="feminine_accusative"/> nowe wiadomości
  </when>
  <otherwise>
    Masz <varref name="numVM" variant="feminine_accusative"/> nowych wiadomości
  </otherwise>
</choose>
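The evaluation order of these structures can be sketched in Java as an ordered list of tests where the first match wins. This is only an illustration of the branching: the When class and the {numVM} placeholder are hypothetical, the count is printed as digits rather than as the spoken variant that varref would select, and the Polish letters are written as Unicode escapes to keep the source ASCII-only.

```java
import java.util.List;
import java.util.function.IntPredicate;

class ChooseSketch {
    // One <when> branch: a test on the count and the text to emit.
    static final class When {
        final IntPredicate test;
        final String text;

        When(IntPredicate test, String text) {
            this.test = test;
            this.text = text;
        }
    }

    // Tries the branches in order, as <choose> does; the first matching
    // <when> wins and <otherwise> is used when none match.
    static String choose(List<When> branches, String otherwise, int numVM) {
        String text = otherwise;
        for (When w : branches) {
            if (w.test.test(numVM)) {
                text = w.text;
                break;
            }
        }
        return text.replace("{numVM}", Integer.toString(numVM));
    }

    static String english(int n) {
        List<When> branches = List.of(
            new When(x -> x == 0, "You have no new messages."),
            new When(x -> x == 1, "You have one new message."));
        return choose(branches, "You have {numVM} new messages.", n);
    }

    // Same branch order as the Polish translation above.
    static String polish(int n) {
        List<When> branches = List.of(
            new When(x -> x == 0, "Nie masz nowych wiadomo\u015bci"),
            new When(x -> x == 1, "Masz jedn\u0105 now\u0105 wiadomo\u015b\u0107"),
            new When(x -> x < 5, "Masz {numVM} nowe wiadomo\u015bci"),
            new When(x -> x < 21, "Masz {numVM} nowych wiadomo\u015bci"),
            new When(x -> List.of(2, 3, 4).contains(x % 10),
                     "Masz {numVM} nowe wiadomo\u015bci"));
        return choose(branches, "Masz {numVM} nowych wiadomo\u015bci", n);
    }
}
```

The point of the sketch is that the English and Polish versions differ only in data (the branch list), which is what makes it plausible for a translator, rather than a developer, to own that data.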
Now consider how this could be transported in an XLIFF document.
The obvious solution is to include a trans unit for each of the three English variants.
But this requires six trans units in Polish.
The Polish translator could add the extra trans units. We include the <when> conditions in the trans unit IDs.
This becomes awkward for reuse.
If the second English resource is modified in later versions, which of the six Polish trans units can be reused, and which need retranslation?
It also requires that the XLIFF editing tool be capable of adding, and allowed to add, extra trans units.
A simpler variation may be to put the entire structure as the translatable text in a single XLIFF trans unit.
The translator then localizes a single XSLT fragment for each resource, and the translated XLIFF file then becomes:
<xliff version='1.1' xmlns='urn:oasis:names:tc:xliff:document:1.1'>
  <file original='hello.txt' source-language='en' target-language='pl' datatype='plaintext'>
    <body>
      <trans-unit id='new_messages'>
        <source><var name="numVM" type="integer"/>
          <choose>
            <when test="numVM==0">
              You have no new messages.
            </when>
            <when test="numVM==1">
              You have one new message.
            </when>
            <otherwise>
              You have <varref name="numVM"/> new messages.
            </otherwise>
          </choose></source>
        <target><var name="numVM" type="integer"/>
          <choose>
            <when test="numVM==0">
              Nie masz nowych wiadomości
            </when>
            <when test="numVM==1">
              Masz jedną nową wiadomość
            </when>
            <when test="numVM&lt;5">
              Masz <varref name="numVM" variant="feminine_accusative"/> nowe wiadomości
            </when>
            <when test="numVM&lt;21">
              Masz <varref name="numVM" variant="feminine_accusative"/> nowych wiadomości
            </when>
            <when test="in(numVM % 10, [2,3,4])">
              Masz <varref name="numVM" variant="feminine_accusative"/> nowe wiadomości
            </when>
            <otherwise>
              Masz <varref name="numVM" variant="feminine_accusative"/> nowych wiadomości
            </otherwise>
          </choose></target>
      </trans-unit>
    </body>
  </file>
</xliff>
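Because the whole structure travels inside one trans unit, a tool can inspect it with an ordinary XML parser. The sketch below is one hedged way to do that with the JDK's DOM API: it parses a shortened, hypothetical trans unit of the same shape (Polish text without diacritics, to keep the source ASCII-only) and counts the <when> branches in the source and target. The class and method names are assumptions for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XliffBranches {
    // Shortened, hypothetical trans unit in the shape shown above.
    static final String XLIFF =
        "<xliff version='1.1' xmlns='urn:oasis:names:tc:xliff:document:1.1'>"
      + "<file original='hello.txt' source-language='en'"
      + " target-language='pl' datatype='plaintext'><body>"
      + "<trans-unit id='new_messages'>"
      + "<source><choose>"
      + "<when test='numVM==1'>You have one new message.</when>"
      + "<otherwise>You have new messages.</otherwise>"
      + "</choose></source>"
      + "<target><choose>"
      + "<when test='numVM==1'>Masz jedna nowa wiadomosc</when>"
      + "<when test='numVM&lt;5'>Masz nowe wiadomosci</when>"
      + "<otherwise>Masz nowych wiadomosci</otherwise>"
      + "</choose></target>"
      + "</trans-unit></body></file></xliff>";

    // Counts the <when> branches under the first element with the given
    // tag name ("source" or "target").
    static int countWhen(String parentName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XLIFF)));
            Element parent =
                (Element) doc.getElementsByTagName(parentName).item(0);
            return parent.getElementsByTagName("when").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XLIFF", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("source branches: " + countWhen("source"));
        System.out.println("target branches: " + countWhen("target"));
    }
}
```

A check like this would let a tool accept a target with more branches than its source without having to create extra trans units.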