This document explains why it is important to support segmentation (multiple source and translation parts inside the <trans-unit>) in XLIFF files, by giving some examples of scenarios in the localisation process where this would be beneficial or necessary.
Throughout the scenarios a particular representation of segmentation in XLIFF is used to illustrate the concepts. The representation used in these examples shows only one way that support for segmentation could be implemented. Many different alternatives will likely be discussed and evaluated before the XLIFF segmentation sub-committee reaches a stage where we can propose a recommendation to the XLIFF Technical Committee on how to represent segmentation in XLIFF documents.
This section describes a simplified localisation workflow, with the different actors involved and the flow of data between them.
Throughout the process we will consider a sample document in XLIFF format to see how it evolves during the process.
The first step in the process of localising content happens at the content owner, and consists of identifying and extracting the localisable content into a format that is suitable for translation. In this example XLIFF is used as output.
The process of conversion of data into XLIFF is termed Filtering. The job is accomplished by a piece of software called a filter. The responsibility of the filter is to separate translatable content from non-translatable content. The filter identifies “hard boundaries” between the different pieces of localisable content. These boundaries are seen in the XLIFF file as the <trans-unit> elements.
It is worth noting that the filter should generally not be concerned about trying to divide the translatable content into sentences. There are a couple of good reasons for this:
Identifying sentence boundaries is not a trivial task. (See the OSCAR web site for the SRX working group for some nasty examples.)
Implementing segmentation rules in the filter may take considerable effort, and makes the filtering process unnecessarily complex.
Segmentation rules implemented in one filter are likely to differ from segmentation rules used by a different filter, or even a different version of the same filter. Different segmentation will affect the ability to re-use translations originating from other XLIFF files of the same or different file type.
Segmentation is dependent on the language of the content. If segmentation rules are implemented in the filter it means that the filter can only ever be used to segment content in the specific languages it supports. This is an unnecessary limitation.
There is no reason for a filter to introduce hard boundaries between sentences. Though this may at first seem like a good idea it introduces a lot of inflexibility.
Sometimes during translation multiple source language sentences must be translated as a single target language sentence. If the different source language sentences appear in different <trans-unit> elements, there is no safe way to accomplish this and still guarantee that the translated content will appear correct once converted back to the original format. Translation tools working on the XLIFF files cannot make any assumptions about how the content of individual <trans-unit> elements will be used, as that process is entirely defined by the filter.
Flexibility of re-use is limited. For example it is no longer possible to apply re-use from paragraph based translation sources.
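The difficulty of the first point above can be illustrated with a deliberately naive sentence splitter (a hypothetical sketch in Python; real tools use SRX-style rule sets with break and no-break exceptions):

```python
import re

def naive_segment(text):
    # Split after '.', '!' or '?' followed by whitespace -- far too simplistic.
    return re.split(r'(?<=[.!?])\s+', text)

# Abbreviations defeat the naive rule: "Mr." is treated as a sentence end.
print(naive_segment("Mr. Smith arrived. He sat down."))
# ['Mr.', 'Smith arrived.', 'He sat down.']
```

An SRX rule set handles such cases by pairing break rules like the one above with no-break exceptions for "Mr.", "e.g.", ordinal numbers, and so on, which is exactly the kind of language-specific knowledge that does not belong in a filter.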
The filtering may involve interaction with a Content Management System (CMS) and/or conversion of data from file formats such as HTML, XML, Microsoft Word, etc. The output is an XLIFF file, which for the sake of our example would have the following content (only the <body> element is shown for simplicity):
<body>
  <trans-unit id="1">
    <source xml:lang="en-US">The Document Title</source>
  </trans-unit>
  <trans-unit id="2">
    <source xml:lang="en-US">First sentence. <bpt id="1">[ITALIC:</bpt>This is an important sentence.<ept id="1">]</ept></source>
  </trans-unit>
  <trans-unit id="3">
    <source xml:lang="en-US">Ambiguous sentence. More <bpt id="1">[LINK-to-toc:</bpt>content<ept id="1">]</ept>.</source>
  </trans-unit>
</body>
Here we can see that the filter has produced three <trans-unit> elements in the XLIFF file. One seems to contain the document title, and the second and the third contain the first paragraphs of the document.
To make things a little bit more interesting, the important sentence in the middle of the second <trans-unit> was formatted in italics. In this example we can see that the italic formatting is applied using some unconventional text-based tagging that seems to start with [ITALIC: and end with ]. The filter has identified these special tags and wrapped them in <bpt> and <ept> elements, which means that any XLIFF compliant tool can process them without needing to know anything more about them.
In a similar manner the word “content” in the third <trans-unit> has been tagged as a link. This too was identified by the filter, and the tagging has been wrapped in <bpt> and <ept> elements.
The XLIFF file is now passed on to the localisation agency, possibly along with additional data such as existing translation memories to use for the translation, previous source and localised versions of the same content, standardised terminology to be used, and any additional instructions. They would also specify which languages the content should be translated into, request a quote, etc.
The localisation agency interacts with the content owner and coordinates the entire localisation process. In some cases a dedicated department in the same company as the content owner fulfils this role, and in other cases external localisation agencies are contracted to do this job.
The first thing the localisation agency does when it receives content for translation is to determine the scope of the job. This typically involves getting a source language word count for the translatable content. If similar previously translated content is available and applicable it also involves finding out how much of the previously translated content can be re-used for each of the target languages.
The word count can be performed by any word counting tool that supports XLIFF. Here we assume that the localisation agency uses a word counting tool that produces a standardised word count that is accepted as the norm for the industry. (Word counting is also a tricky business, and different tools often produce different word counts for the same content. Discrepancies arise from how punctuation, dates, numbers, embedded tags etc. are treated.) The total word count for the XLIFF file in this example is 14. If desirable the tool could also annotate the XLIFF file with word count information.
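As a rough sketch of the counting rule assumed here (whitespace-separated words in the <source> text, with inline tag markers treated as invisible), the total of 14 for the example file could be computed like this. This is a simplified illustration, with namespaces and xml:lang attributes omitted; real word counting tools follow more elaborate rules:

```python
import xml.etree.ElementTree as ET

def count_source_words(body_xml):
    """Count whitespace-separated words in each <source>, treating inline
    markup (<bpt>/<ept>) as invisible: only the text outside the tag
    markers is counted (a simplified counting rule)."""
    root = ET.fromstring(body_xml)
    total = 0
    for source in root.iter('source'):
        text = source.text or ''
        for child in source:           # skip the tag markers themselves,
            text += child.tail or ''   # keep the text that follows them
        total += len(text.split())
    return total

body = """<body>
  <trans-unit id="1"><source>The Document Title</source></trans-unit>
  <trans-unit id="2"><source>First sentence. <bpt id="1">[ITALIC:</bpt>This is an important sentence.<ept id="1">]</ept></source></trans-unit>
  <trans-unit id="3"><source>Ambiguous sentence. More <bpt id="1">[LINK-to-toc:</bpt>content<ept id="1">]</ept>.</source></trans-unit>
</body>"""

print(count_source_words(body))  # 14, matching the total quoted above
```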
In this example the localisation agency is lucky enough to have access to a set of translation memories that were used when this content was last translated. The content of the US English to Swedish translation memory (exported into TMX) looks like this (only relevant parts of the <body> element are shown for simplicity):
<tmx version="1.x">
  ...
  <body>
    <tu id="1">
      <tuv lang="EN-US">
        <seg>The Document Title</seg>
      </tuv>
      <tuv lang="SV-SE">
        <seg>Dokumentrubriken</seg>
      </tuv>
    </tu>
    <tu id="2">
      <tuv lang="EN-US">
        <seg>This is an extremely important sentence.</seg>
      </tuv>
      <tuv lang="SV-SE">
        <seg>En mycket viktig mening.</seg>
      </tuv>
    </tu>
    <tu id="3">
      <tuv lang="EN-US">
        <seg>First sentence.</seg>
      </tuv>
      <tuv lang="SV-SE">
        <seg>Första meningen.</seg>
      </tuv>
    </tu>
    <!-- and a lot of other translation units -->
  </body>
</tmx>
The TMX file also contains a reference to an SRX file that specifies the segmentation rules used when this translation memory was created. (If there is no SRX file it may be necessary to consult other sources of information to find out which segmentation rules were used while building up the translation memory.) For this example it is sufficient to assume that the SRX file contains rules that specify sentence based segmentation.
To estimate the re-use from the previously translated Swedish content the localisation agency makes use of a translation memory tool that supports TMX, SRX, and XLIFF.
The SRX and TMX data is used to initialise the translation memory tool. Then the XLIFF file is passed through the translation memory tool to estimate how much of the previously translated content can be re-used.
The translation memory tool iterates over the <trans-unit> elements in the XLIFF file, and for each <trans-unit> does the following:
The segmentation rules from the SRX file are applied to determine how the content in the <source> element should be divided (segmented) in order to achieve the best possible fit against the translation memory data originating from the TMX file.
For each of the identified pieces (segments) of source content the translation memory tool does a lookup in the translation memory, and records the result. The result is typically a percentage value from 0-100% indicating how well the best matching segment in the translation memory corresponds to the document segment.
The recorded re-use data is categorised and presented in a report that the localisation agency uses to determine the scope of the project. This data is used to produce quotes and to do resource planning.
(The translation memory tool may also do additional processing such as counting the number of times each source sentence with 0% match appears in the XLIFF file (repetitions), and apply various smart algorithms to optimise re-use of texts that contain numbers, dates, tags, etc. This often leads to increased re-use.)
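The match-quality percentage could be sketched as a simple string similarity ratio. Python's difflib is used here purely as a stand-in; real translation memory tools apply their own proprietary scoring, so the exact percentages (such as the 75-84% band in the report that follows) will differ between tools:

```python
from difflib import SequenceMatcher

def match_quality(doc_segment, tm_segment):
    """Score how well a translation memory source segment matches a
    document segment, as a 0-100 percentage. SequenceMatcher is only a
    stand-in for a real TM tool's scoring algorithm."""
    return round(100 * SequenceMatcher(None, doc_segment, tm_segment).ratio())

# An exact match scores 100%:
print(match_quality('First sentence.', 'First sentence.'))  # 100

# A near match scores somewhere below 100%; the exact figure depends
# entirely on the scoring algorithm used:
print(match_quality('This is an important sentence.',
                    'This is an extremely important sentence.'))
```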
The re-use report for Swedish with this particular XLIFF file and TMX + SRX could look like this:
Match Type    Segments    Words    Percent
100%                 2        5         36
95% - 99%            0        0          0
85% - 94%            0        0          0
75% - 84%            1        5         36
50% - 74%            0        0          0
No Match             2        4         28
Total                5       14        100
Here we can see that 2 segments were considered 100% matches (a total of 5 words, which constitutes 36% of the translatable content in the XLIFF file), 1 segment was a fuzzy match (with a similarity degree of 75-84% between the translation memory and the XLIFF file content), and two segments did not have any match in the translation memory.
These reports would be produced for each target language the content is to be translated into.
The next step in the localisation process is to prepare for the translation step. This is performed by the localisation agency, and includes the following preparation work for the translatable content in the XLIFF file:
Other parts of the preparation work for translation include identifying important terminology, compiling glossaries, organising translation memory data, determining which settings should be used for the localisation tools, identifying parts of the translatable content that require special considerations, and putting together instructions for the translators.
Many parts of the preparation work need to be done for each language the content will be translated into.
Let's look at the XLIFF preparation for the Swedish translation. Identifying the best segment boundaries for the first <trans-unit> is easy, as there is a 100% match for the entire source text. Thus we can use the standard XLIFF mechanism and specify the translation by simply inserting the Swedish translation as a <target>, e.g. like this:
<trans-unit id="1">
<source xml:lang="en-US">The Document Title</source>
<target xml:lang="sv-SE" state="translated" state-qualifier="leveraged-tm">Dokumentrubriken</target>
</trans-unit>
Looking at the second <trans-unit>:
<trans-unit id="2">
  <source xml:lang="en-US">First sentence. <bpt id="1">[ITALIC:</bpt>This is an important sentence.<ept id="1">]</ept></source>
</trans-unit>
we notice that the first sentence has a 100% match. If we ignore the <bpt> and <ept> surrounding the second sentence we have a good fuzzy match for it.
For this example let's assume that the tool processing the XLIFF file is able to detect this and apply the corresponding sentence-based segmentation.
Assuming all content needs to be part of a segment, the content could be divided as follows (each segment has been put on its own line here to make it easier to read):
First sentence.
 <bpt id="1">[ITALIC:</bpt>
This is an important sentence.
<ept id="1">]</ept>
Each segment will have a translation, and could also have alternative translations and/or fuzzy matches, in the same way as a <trans-unit>. Thus it would make sense to divide the content of the <trans-unit> into segments, like this:
<trans-unit id="2">
  <segment>
    <source xml:lang="en-US">First sentence.</source>
  </segment>
  <segment>
    <source xml:lang="en-US"> <bpt id="1">[ITALIC:</bpt></source>
  </segment>
  <segment>
    <source xml:lang="en-US">This is an important sentence.</source>
  </segment>
  <segment>
    <source xml:lang="en-US"><ept id="1">]</ept></source>
  </segment>
</trans-unit>
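Assuming that all content must belong to some segment, a tool could verify the invariant that the segment sources, concatenated in order, reproduce the original <source> content exactly. A hypothetical check, using the flattened text of the example:

```python
# Hypothetical invariant check: the <segment> sources of trans-unit 2,
# concatenated in order, must be character-identical to the original
# <source> content, including the inter-sentence space and the inline
# tag markers.
original = 'First sentence. [ITALIC:This is an important sentence.]'
segments = [
    'First sentence.',
    ' [ITALIC:',                        # the space travels with the tag
    'This is an important sentence.',
    ']',
]
assert ''.join(segments) == original    # nothing lost, nothing added
```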
This allows the tool to apply the translation memory based re-use to the <trans-unit> content in a natural way, making use of the same mechanisms as are available to the <trans-unit> element:
<trans-unit id="2">
  <segment>
    <source xml:lang="en-US">First sentence.</source>
    <target xml:lang="sv-SE" state="translated" state-qualifier="leveraged-tm">Första meningen.</target>
  </segment>
  <segment>
    <source xml:lang="en-US"> <bpt id="1">[ITALIC:</bpt></source>
  </segment>
  <segment>
    <source xml:lang="en-US">This is an important sentence.</source>
    <alt-trans origin="translation memory" match-quality="80%">
      <source xml:lang="en-US">This is an extremely important sentence.</source>
      <target xml:lang="sv-SE">En mycket viktig mening.</target>
    </alt-trans>
  </segment>
  <segment>
    <source xml:lang="en-US"><ept id="1">]</ept></source>
  </segment>
</trans-unit>
Let's now look at the third <trans-unit>. Here there are no matches from the translation memory. Depending on its settings, the translation memory tool can either leave the content as is, or divide it into segments according to its own segmentation algorithm, introducing segments for the different components.
Here are some reasons why the non-matching content should also be segmented by the translation memory tool:
If the same translation memory will be updated and applied during the translation process the new translation memory entries can be directly applied to the content of the file.
If the translation memory is used interactively during the translation it is important that the new content added to it follows the same segmentation rules as the rest of the translation memory content.
If the translated files are used to update and/or populate the translation memory after the translation process is finished, only segments that follow the same rules as the translation memory will be useful.
Segmentation of untranslated content can facilitate use of other translation memory tools that may not have the exact same segmentation rules, but for which some content would still be segmented in the same way. It is possible to translate the document using other translation memory tools and still update the original translation memory based on the translations, without any differences in segmentation between old and new translation memory content.
In this example we assume that the tool has been set up to also segment content that has no matches in the translation memory. The result could look like this:
<trans-unit id="3">
  <segment>
    <source xml:lang="en-US">Ambiguous sentence.</source>
  </segment>
  <segment>
    <source xml:lang="en-US"> </source>
  </segment>
  <segment>
    <source xml:lang="en-US">More <bpt id="1">[LINK-to-toc:</bpt>content<ept id="1">]</ept>.</source>
  </segment>
</trans-unit>
An interesting fact to note is that the space character between the sentences has ended up in its own segment. There are several reasons for that:
It would not make sense to include it at the beginning or end of a translation unit containing a sentence. That would limit the full re-usability of that sentence to cases where it appears before or after another sentence.
It cannot be omitted, as that could result in the output document having no space character between the sentences.
In some cases it actually needs to be localised. For example, a common practice in US English texts is to use a double space between sentences, which is not the case in many other languages. In such cases the US double space between sentences should be localised into a single space, or vice versa.
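The localisation of such a whitespace-only segment could be sketched as follows (the convention table is a made-up example; a real tool would make this configurable per locale):

```python
def sentence_separator(target_lang):
    """Return the inter-sentence whitespace conventional for target_lang,
    used to localise a whitespace-only segment between sentences.
    The convention table below is a hypothetical example."""
    double_space_langs = {'en-US'}   # assumed: US English double-spaces
    return '  ' if target_lang in double_space_langs else ' '

print(repr(sentence_separator('sv-SE')))  # ' '  (single space)
print(repr(sentence_separator('en-US')))  # '  ' (double space)
```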
The pre-translated and segmented document is now ready for translation. It will be sent together with all other related material to the translator.
Translation is performed by a qualified translator. Translators may work on-site at a localisation agency or department, or off-site either in teams in separate companies contracted by the localisation agency, or as individual freelance translators.
The translator will make use of an XLIFF aware translation editor tool to accomplish the job. As part of the process they will get familiar with and explore specific concepts and terminology used in the document and identify their specific translations. If available they will also make use of validation and verification tools to ensure the validity of the file during and after translation. The verification tools can be generic and simply verify a set of predefined features in the XLIFF document, but often it is necessary for them to be specialised on the actual file format being translated (i.e. the file format before it is turned into XLIFF).
The translation editor may also integrate with other Computer Aided Translation (CAT) tools such as a translation memory, active term recognition, machine translation systems etc.
Given that the document is segmented by sentence, the translator typically works through the document dealing with each sentence needing attention in sequence. This provides as much context information as possible, which is often vital for producing a good translation.
Sometimes it may not be clear how a particular sentence should be translated. This could be due to ambiguities in the text, unknown terminology, or too little available context information. When this happens the translator may need to communicate with the client to clarify the issue.
At the time the translation is started the document in our example looks like this (everything except the document body omitted for clarity):
<body>
  <trans-unit id="1">
    <source xml:lang="en-US">The Document Title</source>
    <target xml:lang="sv-SE" state="translated" state-qualifier="leveraged-tm">Dokumentrubriken</target>
  </trans-unit>
  <trans-unit id="2">
    <segment>
      <source xml:lang="en-US">First sentence.</source>
      <target xml:lang="sv-SE" state="translated" state-qualifier="leveraged-tm">Första meningen.</target>
    </segment>
    <segment>
      <source xml:lang="en-US"> <bpt id="1">[ITALIC:</bpt></source>
    </segment>
    <segment>
      <source xml:lang="en-US">This is an important sentence.</source>
      <alt-trans origin="translation memory" match-quality="80%">
        <source xml:lang="en-US">This is an extremely important sentence.</source>
        <target xml:lang="sv-SE">En mycket viktig mening.</target>
      </alt-trans>
    </segment>
    <segment>
      <source xml:lang="en-US"><ept id="1">]</ept></source>
    </segment>
  </trans-unit>
  <trans-unit id="3">
    <segment>
      <source xml:lang="en-US">Ambiguous sentence.</source>
    </segment>
    <segment>
      <source xml:lang="en-US"> </source>
    </segment>
    <segment>
      <source xml:lang="en-US">More <bpt id="1">[LINK-to-toc:</bpt>content<ept id="1">]</ept>.</source>
    </segment>
  </trans-unit>
</body>
The translator would work through this document segment by segment. Since a translation already exists for the first <trans-unit> and the first <segment> in the second <trans-unit>, there is no work needed for these (unless for some reason an incorrect or unsuitable translation has been used).
The second segment may or may not need to be altered as part of the localisation. There is no translatable content here, but there is a tag and a space character. In our case we assume that this does not require any localisation effort, and the translator can confirm this by introducing a <target> with content equal to the <source> and flagging it as translated, like this:
<segment>
<source xml:lang="en-US"> <bpt id="1">[ITALIC:</bpt></source>
<target xml:lang="sv-SE" state="translated"> <bpt id="1">[ITALIC:</bpt></target>
</segment>
The next segment has a pretty good fuzzy match. The translation tool would present this to the translator, who may choose to adapt the fuzzy match translation to fit the new source language text:
<segment>
<source xml:lang="en-US">This is an important sentence.</source>
<target xml:lang="sv-SE" state="translated">En viktig mening.</target>
<alt-trans origin="translation memory" match-quality="80%">
<source xml:lang="en-US">This is an extremely important sentence.</source>
<target xml:lang="sv-SE">En mycket viktig mening.</target>
</alt-trans>
</segment>
The last segment in this trans-unit contains only a tag, and will be the same in the target language as well:
<segment>
<source xml:lang="en-US"><ept id="1">]</ept></source>
<target xml:lang="sv-SE" state="translated"><ept id="1">]</ept></target>
</segment>
During this process of translation the translator could e.g. also use his/her own translation memory, which may have matches for content that does not have <alt-trans> elements in the XLIFF document.
When reaching the third <trans-unit> the translator struggles with the ambiguous sentence in the first segment. After some deliberation he/she decides on a translation that likely fits the purpose, but would like to highlight that this particular sentence may need additional attention during the editing and/or proof reading stages. The XLIFF translation editor application captures the translator's comments and stores them embedded in the XLIFF document together with this segment, as follows:
<segment>
  <source xml:lang="en-US">Ambiguous sentence.</source>
  <target xml:lang="sv-SE" state="needs-review-translation">Omstridd mening.</target>
  <note annotates="target" from="Swedish Translator">This translation may not be appropriate. Please evaluate it carefully!</note>
</segment>
The rest of the document is localised in the same manner as above, and the results look like this:
<segment>
  <source xml:lang="en-US"> </source>
  <target xml:lang="sv-SE" state="translated"> </target>
</segment>
<segment>
  <source xml:lang="en-US">More <bpt id="1">[LINK-to-toc:</bpt>content<ept id="1">]</ept>.</source>
  <target xml:lang="sv-SE" state="translated">Ytterligare <bpt id="1">[LINK-to-toc:</bpt>innehåll<ept id="1">]</ept>.</target>
</segment>
In the last segment the translator must place the <bpt> and <ept> elements in the correct location for the target language translation.
When the translation job is finished the file is sent back to the localisation agency for further processing.
Depending on the localisation process applied the translated XLIFF document may go through one or more of the following stages: editing, proof reading, and reviewing. Inside and outside of the localisation industry there are many different interpretations of these terms. In this document the terms are used in the following meanings:
Editing is typically performed as part of an acceptance process where the work of the translator is reviewed and adapted as necessary by editors working for the localisation agency. The editing process may or may not involve the translator (typically for implementing the suggested corrections). The editing work can be done by the localisation agency or it can be outsourced to linguistic experts.
Proof reading is a final verification step conducted by the localisation agency as part of finalising the translation job, in which the edited translation work is read in one go and double-checked for quality and consistency to ensure no errors have slipped through the process. The proof reading can be done by the localisation agency or it can be outsourced to linguistic experts.
Reviewing is done by the content owner to verify the quality of the translation as it is being handed off from the localisation agency. The content owner may perform the review themselves or they may outsource the work to independent consultants. If any corrections are suggested they would be implemented by the localisation agency, which may in turn use the translator to do that.
After translation the body of the XLIFF file in our example that is sent back to the localisation agency from the translator looks like this:
<body>
  <trans-unit id="1">
    <source xml:lang="en-US">The Document Title</source>
    <target xml:lang="sv-SE" state="translated" state-qualifier="leveraged-tm">Dokumentrubriken</target>
  </trans-unit>
  <trans-unit id="2">
    <segment>
      <source xml:lang="en-US">First sentence.</source>
      <target xml:lang="sv-SE" state="translated" state-qualifier="leveraged-tm">Första meningen.</target>
    </segment>
    <segment>
      <source xml:lang="en-US"> <bpt id="1">[ITALIC:</bpt></source>
      <target xml:lang="sv-SE" state="translated"> <bpt id="1">[ITALIC:</bpt></target>
    </segment>
    <segment>
      <source xml:lang="en-US">This is an important sentence.</source>
      <target xml:lang="sv-SE" state="translated">En viktig mening.</target>
      <alt-trans origin="translation memory" match-quality="80%">
        <source xml:lang="en-US">This is an extremely important sentence.</source>
        <target xml:lang="sv-SE">En mycket viktig mening.</target>
      </alt-trans>
    </segment>
    <segment>
      <source xml:lang="en-US"><ept id="1">]</ept></source>
      <target xml:lang="sv-SE" state="translated"><ept id="1">]</ept></target>
    </segment>
  </trans-unit>
  <trans-unit id="3">
    <segment>
      <source xml:lang="en-US">Ambiguous sentence.</source>
      <target xml:lang="sv-SE" state="needs-review-translation">Omstridd mening.</target>
      <note annotates="target" from="Swedish Translator">This translation may not be appropriate. Please evaluate it carefully!</note>
    </segment>
    <segment>
      <source xml:lang="en-US"> </source>
      <target xml:lang="sv-SE" state="translated"> </target>
    </segment>
    <segment>
      <source xml:lang="en-US">More <bpt id="1">[LINK-to-toc:</bpt>content<ept id="1">]</ept>.</source>
      <target xml:lang="sv-SE" state="translated">Ytterligare <bpt id="1">[LINK-to-toc:</bpt>innehåll<ept id="1">]</ept>.</target>
    </segment>
  </trans-unit>
</body>
The localisation agency contacts their Swedish language editor for editing the document.
The editor opens the document in an XLIFF aware application suitable for editing and reviewing work, and looks at the individual translations in the document and approves translations or suggests corrections as appropriate. Each approved <segment> or <trans-unit> is marked by the tool through a change of the state attribute value and corresponding change/removal of the state-qualifier attribute. The first two approved segments look like this:
<trans-unit id="1">
  <source xml:lang="en-US">The Document Title</source>
  <target xml:lang="sv-SE" state="signed-off">Dokumentrubriken</target>
</trans-unit>
<trans-unit id="2">
  <segment>
    <source xml:lang="en-US">First sentence.</source>
    <target xml:lang="sv-SE" state="signed-off">Första meningen.</target>
  </segment>
  ...
When the editor sees the note from the Swedish Translator about the potential issue he/she does further research into whether the segment in question is correctly translated. As a result the editor marks the segment as rejected and suggests an improved translation of that segment, and the XLIFF tool for editing/reviewing captures the suggested corrections in the XLIFF document as follows:
<segment>
  <source xml:lang="en-US">Ambiguous sentence.</source>
  <target xml:lang="sv-SE" state="rejected-inaccurate">Omstridd mening.</target>
  <note annotates="target" from="Swedish Translator">This translation may not be appropriate. Please evaluate it carefully!</note>
  <note annotates="target" from="Swedish Editor">Change to: "Tvetydig mening."</note>
</segment>
The editor approves all other content in our sample document. The edited document is sent back to the localisation agency, who detects that at least one of the translations was rejected, and as a result sends the file back to the translator for implementing the corrections. (This gives the translator a chance to review the corrections and learn from his/her mistakes.)
The translator receives the file, implements the correction, and removes the corresponding notes.
<segment>
  <source xml:lang="en-US">Ambiguous sentence.</source>
  <target xml:lang="sv-SE" state="translated">Tvetydig mening.</target>
</segment>
Once the updated document has been sent back to the localisation agency they may determine if it requires a second round of editing to ensure that the suggested changes have been correctly implemented. In this case the decision is made that, since the number of corrections was so small, any outstanding problems are likely to be caught in the proof reading.
Proof reading proceeds much like the editing, except that far fewer corrections are expected. When proof reading has finished and all corrections have been implemented and signed off, the following should be true:
Each <source> element in the XLIFF file should have a corresponding <target>.
Each <target> element should have the state attribute set to signed-off.
All <note> elements annotating target translations should normally have been addressed and removed.
The file is now ready for review, and is sent to the content owner, who either handles the review internally or outsources the job to external subject matter experts.
Depending on the type of content it may be possible to do the review work directly on the XLIFF files or it may be necessary to convert the content in the XLIFF file to its final format in order to see all context information. (The latter may be particularly important if the document contains important graphics that are referenced in the text, but do not appear in the XLIFF files.)
If the review is done on the final format the review comments will typically be supplied on a marked-up printed paper copy, or typed up in an email. In that case the XLIFF files need to be manually updated with any necessary corrections based on the reviewer's feedback.
On the other hand, if it is possible to do the review work directly on the XLIFF file, the reviewer may be able to use an XLIFF application to do the job. The workflow would then be similar to the editing and proof reading stages described above.
In either case, when all review comments have been implemented, the state of each <target> element should be changed to final to indicate that no more linguistic changes should be done to the document. After the review has finished the body of the XLIFF document looks like this:
<body>
  <trans-unit id="1">
    <source xml:lang="en-US">The Document Title</source>
    <target xml:lang="sv-SE" state="final">Dokumentrubriken</target>
  </trans-unit>
  <trans-unit id="2">
    <segment>
      <source xml:lang="en-US">First sentence.</source>
      <target xml:lang="sv-SE" state="final">Första meningen.</target>
    </segment>
    <segment>
      <source xml:lang="en-US"> <bpt id="1">[ITALIC:</bpt></source>
      <target xml:lang="sv-SE" state="final"> <bpt id="1">[ITALIC:</bpt></target>
    </segment>
    <segment>
      <source xml:lang="en-US">This is an important sentence.</source>
      <target xml:lang="sv-SE" state="final">En viktig mening.</target>
      <alt-trans origin="translation memory" match-quality="80%">
        <source xml:lang="en-US">This is an extremely important sentence.</source>
        <target xml:lang="sv-SE">En mycket viktig mening.</target>
      </alt-trans>
    </segment>
    <segment>
      <source xml:lang="en-US"><ept id="1">]</ept></source>
      <target xml:lang="sv-SE" state="final"><ept id="1">]</ept></target>
    </segment>
  </trans-unit>
  <trans-unit id="3">
    <segment>
      <source xml:lang="en-US">Ambiguous sentence.</source>
      <target xml:lang="sv-SE" state="final">Tvetydig mening.</target>
    </segment>
    <segment>
      <source xml:lang="en-US"> </source>
      <target xml:lang="sv-SE" state="final"> </target>
    </segment>
    <segment>
      <source xml:lang="en-US">More <bpt id="1">[LINK-to-toc:</bpt>content<ept id="1">]</ept>.</source>
      <target xml:lang="sv-SE" state="final">Ytterligare <bpt id="1">[LINK-to-toc:</bpt>innehåll<ept id="1">]</ept>.</target>
    </segment>
  </trans-unit>
</body>
This is the final version of the XLIFF file that will be delivered from the localisation agency to the content owner.
Once the content owner receives the final translated version of the XLIFF file they may want to update their central translation memory to include these translations.
Assuming that the segmentation in the XLIFF file has not been changed, it should still match the segmentation used in the translation memory. It should then simply be a matter of feeding the XLIFF file into a tool that either updates the central translation memory directly or creates a TMX file that can be used for that purpose.
This tool would simply iterate over all <source> and <target> pairs in the XLIFF file and create new or update existing translation units in the translation memory for each of them.
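Such a tool could be sketched as follows. This is a simplified extraction using Python's standard XML library, with namespaces and xml:lang attributes omitted for brevity; a real tool would also carry over language codes and normalise the inline markup:

```python
import xml.etree.ElementTree as ET

def extract_tm_pairs(body_xml):
    """Collect (source, target) text pairs from translated <trans-unit>
    and <segment> elements, ready for a TM update or TMX export.
    Inline tag text is kept verbatim here; real tools would normalise it."""
    root = ET.fromstring(body_xml)
    pairs = []
    for elem in root.iter():
        if elem.tag not in ('trans-unit', 'segment'):
            continue
        # A segmented trans-unit has no direct <source>/<target> children,
        # so it is skipped here and its segments are picked up instead.
        src, tgt = elem.find('source'), elem.find('target')
        if src is not None and tgt is not None:
            pairs.append((''.join(src.itertext()), ''.join(tgt.itertext())))
    return pairs

body = """<body>
  <trans-unit id="1">
    <source>The Document Title</source>
    <target state="final">Dokumentrubriken</target>
  </trans-unit>
  <trans-unit id="3">
    <segment>
      <source>Ambiguous sentence.</source>
      <target state="final">Tvetydig mening.</target>
    </segment>
  </trans-unit>
</body>"""

print(extract_tm_pairs(body))
# [('The Document Title', 'Dokumentrubriken'), ('Ambiguous sentence.', 'Tvetydig mening.')]
```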
<to be continued>