XLIFF is a specification for the lossless interchange of localizable data and its related information. It is tool-neutral, has been formalized as an XML vocabulary (through XML schema), and features an extensibility mechanism.
A white paper describing how to use XLIFF is available for download at the following URL:
http://www.oasis-open.org/apps/group_public/download.php/3110/XLIFF-core-whitepaper_1.1-cs.pdf
XLIFF addresses the following localisation challenges:
XLIFF offers customers of localisation services the following advantages:
XLIFF offers localisation tools vendors the following advantages:
XLIFF offers localisation services providers the following advantages:
The most recent version of XLIFF is version 1.1, which was approved as an OASIS Committee Specification on 20 May 2003.
The latest specification for XLIFF can always be found at:
http://www.oasis-open.org/committees/xliff/documents/xliff-specification.htm
LISA, the Localisation Industry Standards Association, defines Localisation as follows:
"Localisation involves taking a product and making it linguistically and culturally appropriate to the target locale (country/region and language) where it will be used and sold."
For more information on localisation and related topics see:
XLIFF is based on the concept of extracting the source localisation-related data from the original format, and merging it back in place after the localisation has been done. Extraction and merge routines must be developed for each native data type as file filters, or XSL scripts.
Some open source and commercially available tools provide built-in support for the more common resource types, so ask your tools vendor what resource types they support.
Extraction/Merge principle:
The parts that are not related to localisation are preserved temporarily in the "Skeleton". There are no rules on how to represent the data in the Skeleton itself; this is left to the discretion of the filters. XLIFF 1.1 focuses on how to store and organize the extracted parts.
Skeletons can be either embedded directly in the XLIFF document with the <internal-file> element or simply referred to with the <external-file> element.
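The extraction/merge principle described above can be sketched in Python for a simple key=value properties file. This is only an illustration under stated assumptions: the function names extract and merge, the %%%id%%% placeholder syntax, and the choice to keep the skeleton as a plain string are all hypothetical, since XLIFF leaves the skeleton representation to each filter.

```python
import re
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def extract(properties_text, original="file1.prop", source_lang="en-US"):
    """Split a key=value file into an XLIFF tree and a skeleton string."""
    xliff = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.1"})
    file_el = ET.SubElement(xliff, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": source_lang,
        "datatype": "javapropertyresourcebundle",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    skeleton_lines = []
    next_id = 1
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        if not sep or line.lstrip().startswith("#"):
            # Comments and blank lines are not localisable: keep them as-is.
            skeleton_lines.append(line)
            continue
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit",
                             {"id": str(next_id), "resname": key.strip()})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = value
        # Hypothetical placeholder syntax: it points back at the unit id.
        skeleton_lines.append(f"{key}=%%%{next_id}%%%")
        next_id += 1
    return xliff, "\n".join(skeleton_lines)


def merge(xliff, skeleton):
    """Rebuild the native file, preferring <target> text over <source>."""
    texts = {}
    for unit in xliff.iter(f"{{{XLIFF_NS}}}trans-unit"):
        target = unit.find(f"{{{XLIFF_NS}}}target")
        source = unit.find(f"{{{XLIFF_NS}}}source")
        chosen = target if target is not None else source
        texts[unit.get("id")] = chosen.text or ""
    return re.sub(r"%%%(\d+)%%%", lambda m: texts[m.group(1)], skeleton)
```

Units that have not been translated yet fall back to their source text on merge, so the rebuilt file is always complete even when the translation is partial.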
You can validate XLIFF documents using the XSD schema provided for this purpose.
The XLIFF schema is available at:
http://www.oasis-open.org/committees/xliff/documents/xliff-core-1.1.xsd
The translatable content ("resources") of an application, database or website is first extracted, translated or modified for a given language or market and finally rebuilt or redeployed. Numerous commercial tools are available to optimise and reduce the cost of translation.
This use case describes a very primitive localisation process. In this example, a developer writes code, and hands it off to a localisation engineer. All of the process complexity exists in the localisation domain. The localisation engineer receives all of the resources in their original native format. In order for the native files to be localised, tool filters must be available that interpret the localizable resources in the native file, or possibly complicated multi-tool solutions are required in order to translate all the native files.
Typical localisation workflow without XLIFF:
Each time a new native format is introduced or an existing one is changed, localisation tools engineers, who may not be experts in the native format, must revise the tool and/or filter. And since new or changed resource types are generally discovered when the tools fail in the midst of a project, supporting internal localisation tools becomes a firefight.
This model is highly reactive, and will inevitably result in project delays and costs due to frequent retooling. It is also more likely to degrade the quality of the translated work, because data can be misinterpreted when converting between the native format and the localisation tool's internal data representation.
Below is a use case that illustrates how XLIFF can be used to improve the localisation process:
Localisation workflow with XLIFF:
In this model, an XLIFF compliant tool outputs directly to XLIFF and this file is handed off to the localisation engineers. Another scenario may be that developers output their work to native files as before, but before the files are handed off for localisation a pre-processor converts the data into XLIFF. In each of these use cases, when new formats are introduced into the development process or existing ones are changed, developer/publishers are responsible for handing off the data as XLIFF.
This proactive model simplifies the formats that localisation tools must support, and removes process complexity in the localisation engineering domain. It also places the responsibility for converting the native data to XLIFF with those who are most knowledgeable about the native format.
A more advanced implementation is illustrated below.
Automated workflow with XLIFF and CAT tools:
This use case further extends the workflow to include CAT (Computer Aided Translation) tools. In this scenario, the XLIFF files are moved through the workflow as before, but translation memory fuzzy matches may also be added to the XLIFF file as <alt-trans> elements, and machine translations may be added as well. XLIFF tools that support <alt-trans> may present these "alternative translations" to the translator to enhance their productivity. Additionally, references to related glossary data can be stored in the XLIFF file and handed off to the translator.
An XLIFF document contains essentially data that need to be modified in order to localise the original resources from which the document is created. For example:
It can also contain metadata (information about the data) such as:
An XLIFF document is composed of one or more <file> elements. Each <file> element corresponds to an original data source, for example a properties file, a database table, a graphic file, an HTML document, etc.
A <file> element is composed of an optional <header> and a <body>. The header is used to store file-level information; the body contains the data to localise.
The translatable data are stored in <trans-unit> elements, which can be organised in any number of levels of <group> elements. Binary data can also be stored in the file, in <bin-unit> elements.
Example of XLIFF document:
<xliff version='1.1' xmlns="urn:oasis:names:tc:xliff:document:1.1">
 <file original="file1.prop" source-language="en-US" datatype="javapropertyresourcebundle">
  <header>
   <skl><external-file href="file1.prop"/></skl>
  </header>
  <body>
   <trans-unit id="1" resname="id1">
    <source xml:lang="en-US">Text of string 1.</source>
   </trans-unit>
   <trans-unit id="2" resname="id2">
    <source xml:lang="en-US">Text of string 2.</source>
   </trans-unit>
  </body>
 </file>
 <file original="file2.prop" source-language="en-US" datatype="javapropertyresourcebundle">
  <header>
   <skl><external-file href="file2.prop"/></skl>
  </header>
  <body>
   <trans-unit id="1" resname="id1bis">
    <source xml:lang="en-US">String 1 file 2.</source>
   </trans-unit>
   <trans-unit id="2" resname="id2bis">
    <source xml:lang="en-US">String 2 file 2.</source>
   </trans-unit>
  </body>
 </file>
</xliff>
See more examples of XLIFF documents here.
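To make the <file>, <body>, <group> and <trans-unit> structure concrete, here is a minimal Python sketch that walks an XLIFF 1.1 document and lists its translation units. The function name list_units is hypothetical, and the sketch assumes every <trans-unit> has a <source> child; it uses only the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Prefix map for the XLIFF 1.1 namespace; the prefix "x" is arbitrary.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.1"}


def list_units(xliff_text):
    """Return (original, unit id, resname, source text) for every trans-unit."""
    root = ET.fromstring(xliff_text)
    rows = []
    for file_el in root.findall("x:file", NS):
        # "//" also finds units nested inside any number of <group> levels.
        for unit in file_el.findall("x:body//x:trans-unit", NS):
            source = unit.find("x:source", NS)
            rows.append((file_el.get("original"), unit.get("id"),
                         unit.get("resname"), source.text))
    return rows
```

Because the id values only need to be meaningful within one <file>, the original attribute is kept in each row so that units from different files can be told apart.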
An XLIFF document is normally a bilingual file. It has one source language (the language of the original extracted file), and one target language.
However, the <alt-trans> elements can be in languages other than the source or target one. This allows the document to carry translation candidates (and their own source text) in multiple languages, as the example below shows:
<trans-unit id='1'>
 <source xml:lang='fr-fr'>Nouvelle couleur</source>
 <alt-trans match-quality='100%'>
  <source xml:lang='fr-ca'>Nouvelle couleur</source>
  <target xml:lang='en-us'>New color</target>
 </alt-trans>
 <alt-trans match-quality='100%'>
  <source xml:lang='fr-cm'>Nouvelle couleur</source>
  <target xml:lang='en-au'>New colour</target>
 </alt-trans>
</trans-unit>
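A tool that consumes such a unit will typically want only the candidates for the language it is working in. The Python sketch below is one possible way to do that; the function name candidates and the simple prefix comparison of language codes are assumptions made for the illustration, not something the specification prescribes.

```python
import xml.etree.ElementTree as ET

XLIFF = "{urn:oasis:names:tc:xliff:document:1.1}"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def candidates(trans_unit, wanted_lang):
    """Return (match-quality, text) pairs for <alt-trans> targets in wanted_lang.

    "en" matches "en-us" and "en-au"; "en-au" matches only "en-au".
    """
    wanted = wanted_lang.lower()
    found = []
    for alt in trans_unit.findall(f"{XLIFF}alt-trans"):
        target = alt.find(f"{XLIFF}target")
        if target is None:
            continue
        lang = (target.get(XML_LANG) or "").lower()
        if lang == wanted or lang.startswith(wanted + "-"):
            found.append((alt.get("match-quality"), target.text))
    return found
```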
In addition, since source and target language are defined at the <file> element level, and an XLIFF document can contain several <file> elements, it is technically possible to have an XLIFF document with more than one source and one target language.
Yes, it is possible to have user-defined elements and/or attributes in a valid XLIFF document. You can do this by using the XML namespace mechanism.
The following elements allow non-XLIFF elements: <header>, <group>, <tool>, <trans-unit>, <alt-trans>, and <bin-unit>.
The following elements allow non-XLIFF attributes: <file>, <group>, <trans-unit>, <source>, <target>, <tool>, <bin-unit>, <bin-source>, <bin-target>, <alt-trans>, <mrk>, <g>, <x/>, <bx/>, <ex/>, <bpt>, <ept>, <ph>, and <it>.
Example of an XLIFF document with a private namespace (the attributes carrying the xyz prefix):
<xliff version='1.1' xmlns='urn:oasis:names:tc:xliff:document:1.1'
 xmlns:xyz='www.mycompany.com/xyzext.1.1'>
 <file original='Project.grf' source-language='en' datatype='plaintext'
  xyz:srcroot='C:\Projects\Fiji\Images\en' xyz:autolink='yes' xyz:work='Thumbnails'>
  <trans-unit id='jpemphasis.gif' xyz:screenshot='no'>
   <source xml:lang='en'>Emphasis marks</source>
  </trans-unit>
  <trans-unit id='btnHome.png' xyz:screenshot='no'>
   <source xml:lang='en'>Home</source>
  </trans-unit>
  <trans-unit id='btnSearch.png' xyz:screenshot='no'>
   <source xml:lang='en'>Search</source>
  </trans-unit>
 </file>
</xliff>
See the section "Extensibility" in the specification for more information and examples.
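On the consuming side, a tool can separate such extension attributes from the core XLIFF ones by looking at their namespace. The following Python sketch shows one way this might be done with the standard library; the function name extension_attributes is hypothetical.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace of xml:lang


def extension_attributes(element):
    """Return {(namespace, local name): value} for non-XLIFF attributes."""
    result = {}
    for name, value in element.attrib.items():
        # ElementTree spells qualified names as "{namespace}local".
        if not name.startswith("{"):
            continue  # unqualified attributes are core XLIFF attributes
        namespace, local = name[1:].split("}", 1)
        if namespace not in (XLIFF_NS, XML_NS):
            result[(namespace, local)] = value
    return result
```

Keying the result on the namespace URI rather than on the prefix matters because the prefix (xyz in the example) is only a local abbreviation and may differ from one document to the next.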
Yes, it is possible to have user-defined attribute values in the following attributes: context-type, count-type, ctype, datatype, mtype, restype, size-unit, state, state-qualifier, unit, priority, and purpose.
User-defined values must start with an "x-" prefix.
Example of an XLIFF document with user-defined values (the values starting with x-):
<xliff version='1.1' xmlns='urn:oasis:names:tc:xliff:document:1.1'>
 <file original='mydata.slk' source-language='en' datatype='x-excel-slk'>
  <trans-unit id='1:1' restype='x-const'>
   <source xml:lang='en'>Root =</source>
  </trans-unit>
  <trans-unit id='1:1' restype='x-const'>
   <source xml:lang='en'>Number of files =</source>
  </trans-unit>
 </file>
</xliff>
See the section "Extensibility" in the specification for more information and examples.
The id attribute is used to link a <trans-unit> or an inline element to its original location in the source file from which the XLIFF document was produced. The id attribute values are determined by the tool that created the extracted document; they may or may not be the same as the values of the resname attribute.
The resname attribute holds the original identifier of the text item extracted in the <trans-unit> element. For example, with this small properties file:
mnuItemFile=File
mnuItemFileOpen=Open...
Some tools may use their own mechanism to link extracted data and the original file:
<trans-unit id='1' resname='mnuItemFile'>
 <source>File</source>
</trans-unit>
<trans-unit id='2' resname='mnuItemFileOpen'>
 <source>Open</source>
</trans-unit>
and some others may choose to use the property key:
<trans-unit id='mnuItemFile' resname='mnuItemFile'>
 <source>File</source>
</trans-unit>
<trans-unit id='mnuItemFileOpen' resname='mnuItemFileOpen'>
 <source>Open</source>
</trans-unit>
The recommended extension for XLIFF documents is ".xlf".
TMX (Translation Memory eXchange format) is a standard to exchange translation memory content between tools. A collection of <tu> elements in TMX has no specific order and contains no mechanism to rebuild the original file.
Both formats have some elements in common, especially regarding the inline mark-up elements, but there are variations in the attributes of those elements. TMX uses only the encapsulation method for inline codes (where native codes are enclosed within different elements), while XLIFF provides both the encapsulation method (using elements very similar to TMX's) and the placeholder method (where the native codes are removed to the Skeleton file and replaced by a short element that refers to them, using elements very similar to OpenTag's). TMX allows any number of languages in the same document. XLIFF is designed to work with one source and one target language.
TMX can be used in the same framework as XLIFF, for example to carry a translation memory along with the data to localise.
There are different ways to translate an XLIFF document:
Such tools can read XLIFF documents and take advantage of all or most XLIFF features, such as the pre-translated strings available in <alt-trans> elements, and so forth.
Such tools do not require any specific pre-processing of the XLIFF document.
Any translation tool that supports XML can be used to translate an XLIFF file. However, depending on the capabilities of the tool, you may have to ensure a few things in the XLIFF document.
In XLIFF the source text is in the <source> element, and the translated text must go in the <target> element. Many XML-enabled tools cannot place the translation of a text in an element different from the one the source was taken from. You therefore want to make sure that each translation unit in the XLIFF document has a <target> element containing the original text to translate.
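Preparing a document in this way is easy to automate. The Python sketch below copies each <source> into a new <target> wherever a unit does not have one yet; the function name prefill_targets and the decision to set xml:lang on the new element are assumptions made for the illustration.

```python
import copy
import xml.etree.ElementTree as ET

XLIFF = "{urn:oasis:names:tc:xliff:document:1.1}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def prefill_targets(root, target_lang):
    """Give every <trans-unit> lacking a <target> a copy of its <source>.

    Returns the number of <target> elements that were added.
    """
    added = 0
    # Snapshot the units first so the tree is not modified while iterating.
    for unit in list(root.iter(f"{XLIFF}trans-unit")):
        source = unit.find(f"{XLIFF}source")
        if source is None or unit.find(f"{XLIFF}target") is not None:
            continue
        # A deep copy keeps the text and any inline elements of the source.
        target = copy.deepcopy(source)
        target.tag = f"{XLIFF}target"
        target.set(XML_LANG, target_lang)
        # Place the new <target> immediately after <source>.
        unit.insert(list(unit).index(source) + 1, target)
        added += 1
    return added
```

Units that already carry a <target> are left untouched, so running the function on a partially translated document does not overwrite existing translations.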
In XLIFF, translation units can be marked as "not to be translated", as in the example below:
<trans-unit translate="no">
 <source xml:lang='en'>Non-translatable text</source>
 <target xml:lang='mn'>Non-translatable text</target>
</trans-unit>
Many XML-enabled tools cannot determine whether a text is to be translated based on a condition (such as an attribute value), but rely only on element and attribute names. To work around this limitation, you must add a temporary element to allow the tool to detect the parts that should be protected. In the example below, a temporary element <NTBT> (not to be translated) has been added to enclose the protected text:
<trans-unit translate="no">
 <source xml:lang='en'>Non-translatable text</source>
 <target xml:lang='mn'><NTBT>Non-translatable text</NTBT></target>
</trans-unit>
Such pre-processing can be done very easily by applying the following XSL template to the XLIFF document:
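<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xlf="urn:oasis:names:tc:xliff:document:1.1"
 exclude-result-prefixes="xlf" version="1.0">
 <xsl:output encoding="utf-8" />
 <!-- Identity template: copies every node and attribute unchanged. -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>
 <!-- Wraps the content of protected targets in NTBT. Two patterns are given
      because elements of an XLIFF 1.1 document are in the XLIFF namespace
      and are not matched by unprefixed names; the unprefixed pattern covers
      documents that declare no namespace. -->
 <xsl:template match="xlf:trans-unit[@translate='no']/xlf:target | trans-unit[@translate='no']/target">
  <xsl:copy>
   <xsl:apply-templates select="@*"/>
   <NTBT><xsl:apply-templates/></NTBT>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>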
You must also make sure to remove any temporary elements you have added in the XLIFF document before it comes back to the tool that will generate the final localised file.
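Removing the temporary elements can be automated as well. The Python sketch below unwraps every temporary element in place while keeping its text and child elements; the function name unwrap is hypothetical, and the default tag name NTBT assumes the wrapper is not in any namespace (pass a "{namespace}NTBT" name otherwise).

```python
import xml.etree.ElementTree as ET


def unwrap(root, tag="NTBT"):
    """Remove every <tag> wrapper below root, keeping its content in place.

    Returns the number of wrappers removed.
    """
    removed = 0
    for parent in list(root.iter()):
        if parent.tag == tag:
            continue  # wrappers are handled from their parent element
        i = 0
        while i < len(parent):
            child = parent[i]
            if child.tag != tag:
                i += 1
                continue
            inner = list(child)
            lead = child.text or ""
            # Text following the wrapper moves after its last child, or is
            # joined to the wrapper text when there are no child elements.
            if inner:
                inner[-1].tail = (inner[-1].tail or "") + (child.tail or "")
            else:
                lead += child.tail or ""
            # The wrapper text joins whatever text precedes the wrapper.
            if i == 0:
                parent.text = (parent.text or "") + lead
            else:
                parent[i - 1].tail = (parent[i - 1].tail or "") + lead
            parent.remove(child)
            for offset, element in enumerate(inner):
                parent.insert(i + offset, element)
            removed += 1
            # i is not advanced: a moved child may itself be a wrapper.
    return removed
```

The text handling is the delicate part: ElementTree stores the text that follows an element in that element's tail, so the wrapper's text and tail have to be reattached to their neighbours before the wrapper is removed.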
Here are a few examples of XLIFF documents: