
Subject: [OASIS Issue Tracker] (XLIFF-23) Discussion of different methods to handle ITS Translate in XLIFF

David Filip created XLIFF-23:

             Summary: Discussion of different methods to handle ITS Translate in XLIFF
                 Key: XLIFF-23
                 URL: https://issues.oasis-open.org/browse/XLIFF-23
             Project: OASIS XML Localisation Interchange File Format (XLIFF) TC
          Issue Type: Improvement
          Components: ITS Module
    Affects Versions: 2.1_csprd02
         Environment: http://markmail.org/thread/yhhz7tqsh4iz3hgv
            Reporter: David Filip
            Assignee: David Filip
             Fix For: 2.1_cs01

text protection using native codes and <mrk translate='no'>

This is a specific comment about one facet of the ITS module, although I
think there is a more general version of the same argument that can be made
about some of the other examples.

The section in question talks about using the ITS translate attribute to
control translation of content surrounded by inline codes in the source.
The example given is this one:

<p>Text <code translate='no'>Code</code></p>

The section outlines two methods for representing this as XLIFF <source>.
The first method is to represent the native <code> tag as an inline code,
and use a <mrk> to handle the ITS:

<source>Text <pc id='1'><mrk id='m1' translate='no'>Code</mrk></pc></source>

The second method is to include the <code> element's text child in the
XLIFF code:

<source>Text <ph id='1'/></source>
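To make the difference concrete, here is a minimal sketch of how an extractor
might produce each form from the native markup. It is illustrative only: the
function names are made up, it assumes un-namespaced elements and a flat
paragraph whose inline children contain only text, and it says nothing about
how the original <code> markup would be stored for merging.

```python
import xml.etree.ElementTree as ET

def to_source_pc_mrk(p):
    """Method 1 (hypothetical helper): native inline tag -> <pc>,
    ITS translate='no' -> <mrk translate='no'> inside it."""
    source = ET.Element("source")
    source.text = p.text
    for i, child in enumerate(p, start=1):
        pc = ET.SubElement(source, "pc", id=str(i))
        if child.get("translate") == "no":
            mrk = ET.SubElement(pc, "mrk", id=f"m{i}", translate="no")
            mrk.text = child.text
        else:
            pc.text = child.text
        pc.tail = child.tail
    return source

def to_source_ph(p):
    """Method 2 (hypothetical helper): a translate='no' inline element,
    including its text, collapses into a single empty <ph>."""
    source = ET.Element("source")
    source.text = p.text
    for i, child in enumerate(p, start=1):
        if child.get("translate") == "no":
            ph = ET.SubElement(source, "ph", id=str(i))
            ph.tail = child.tail
        else:
            pc = ET.SubElement(source, "pc", id=str(i))
            pc.text = child.text
            pc.tail = child.tail
    return source

p = ET.fromstring("<p>Text <code translate='no'>Code</code></p>")
print(ET.tostring(to_source_pc_mrk(p), encoding="unicode"))
print(ET.tostring(to_source_ph(p), encoding="unicode"))
```

In the second form the word "Code" never appears in the <source> at all, which
is the property the rest of this comment is concerned with.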

A note occurring earlier in the document (in the section discussing the
<ph> element) mentions that this second method is possible, but discouraged:

It is possible although not advised to use <ph> to
mask non translatable inline content. The preferred way of protecting
portions of inline content from translation is the *Core* Translate
Annotation. See also discussion in the ITS Module section on representing
translatability inline.

In practice, I expect the discouraged second method to be the more common
choice among implementations.  There are two reasons for this.  First, it
is easier to implement, because its structure is simpler.

More importantly, it is closer to representing the meaning of the original
source. The recommended structure separates the representation of the
source markup as syntax from its meaning in the text, by pulling the ITS
data out into a separate element.  As a result, the meaning of the source
markup ("the contents of the <code> element are not translatable") is not
preserved.  As far as I can tell (and hopefully I'm not missing something),
there is no mechanism that prevents a Modifier from inserting additional
text between the inline code and mrk tags, like this:

<target>Translation <pc id='1'>*additional text *<mrk id='m1'
translate='no'>Code</mrk> *more additional text*</pc></target>

Or from rearranging these things entirely, like this:

<target>Translation <pc id='1'>*additional text*</pc> <mrk id='m1'
translate='no'>Code</mrk></target>

Both of these produce targets that circumvent the intention of the
its:translate attribute in the native source content.  The original source
text is still protected, but the contents of the overall <code> tag are
mutable, unless the merger takes additional steps beyond the specification
(i.e., inferring that the <pc> and <mrk> tags are bound somehow, and
discarding sibling elements of the <mrk> that appear in the target and not
the source).
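For illustration, the kind of extra check a merger might bolt on could look
roughly like the sketch below. This is not something the specification
defines; the function names and the binding rule (a <pc> whose only content
is a single translate='no' <mrk> is treated as locked) are my own assumptions,
and namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET

def protected_pcs(segment):
    """Map pc id -> protected text for every <pc> whose only content
    is a single <mrk translate='no'> (hypothetical binding rule)."""
    found = {}
    for pc in segment.iter("pc"):
        children = list(pc)
        if (len(children) == 1
                and children[0].tag == "mrk"
                and children[0].get("translate") == "no"
                and not (pc.text or "").strip()
                and not (children[0].tail or "").strip()):
            found[pc.get("id")] = "".join(children[0].itertext())
    return found

def violations(source, target):
    """Return ids of <pc> elements that were locked in the source but are
    missing, unwrapped, or padded with extra text in the target."""
    src = protected_pcs(source)
    tgt = protected_pcs(target)
    return sorted(pc_id for pc_id, text in src.items() if tgt.get(pc_id) != text)

source = ET.fromstring(
    "<source>Text <pc id='1'><mrk id='m1' translate='no'>Code</mrk></pc></source>")
padded = ET.fromstring(
    "<target>Translation <pc id='1'>additional text "
    "<mrk id='m1' translate='no'>Code</mrk> more additional text</pc></target>")
moved = ET.fromstring(
    "<target>Translation <pc id='1'>additional text</pc> "
    "<mrk id='m1' translate='no'>Code</mrk></target>")

print(violations(source, padded))
print(violations(source, moved))
```

Both problem targets above are flagged for pc id 1, while a target that keeps
the <mrk> as the sole content of the <pc> passes. The point is that this logic
lives entirely outside the specification, so every implementation would have
to reinvent it.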

This sort of subversion is probably unlikely, but in my experience
implementors will go to some lengths to prevent accidents from happening
later on, and so might opt for the discouraged <ph/> representation, in
which the non-translatable region is truly off-limits.
