
xliff message


Subject: RE: [xliff] RE: How to translate text within G tags?

Thank you for showing this alternative idea.  I've never had a situation where I've needed to split sentences from the same element into separate trans-units.  It seems like needing to split trans-units this way would present some challenges.
As for your example, I have a concern.  I notice there are escaped tags included.  We specifically advised against this approach in section 2.4 of the HTML profile:

2.4. Including Escaped Markup

The XLIFF specification allows for marking "beginning tags" and "ending tags" (<bpt, <ept) in a way that the markup may be escaped and preserved.  This is generally seen as a way for non-XSLT based tools to abstract the markup.  However, XSLT does not parse escaped code efficiently.  Since there are efficient alternate ways to preserve the HTML code, it is not recommended to use the <bpt and <ept tags.

While the following HTML could be expressed using the <bpt and <ept tags, along with escaped HTML code:

  <i>picabo, big-air</i>,
 and <i> yard-sale</i>

like this:

  <bpt id='1-2'>&lt;i&gt;</bpt>picabo,
 big-air<ept id='1-2'>&lt;/i&gt;</ept>, and
 <bpt id='1-3'>&lt;i&gt;</bpt> yard-sale<ept id='1-3'>&lt;/i&gt;</ept>

It is recommended to use the cleaner, more XSLT-friendly approach, like this:

  <g id='n1' ctype='x-html-i'>picabo, big-air</g>,
 and <g id='n2' ctype='x-html-i'> yard-sale</g>
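As a rough illustration of the difference between the two forms (a sketch only, not part of the profile), the following Python uses the standard library's `xml.etree.ElementTree` to parse both. The wrapping `<source>` element and the variable names are assumptions added for the example. In the escaped form the HTML tag is just character data inside `<bpt>`, while in the `<g>` form the inline element and its translatable text are directly available to an XML tool.

```python
import xml.etree.ElementTree as ET

# Escaped-markup form; the <source> wrapper is assumed for this sketch.
BPT_FORM = (
    "<source><bpt id='1-2'>&lt;i&gt;</bpt>picabo, big-air"
    "<ept id='1-2'>&lt;/i&gt;</ept>, and "
    "<bpt id='1-3'>&lt;i&gt;</bpt> yard-sale"
    "<ept id='1-3'>&lt;/i&gt;</ept></source>"
)

# <g> form recommended by the profile text above.
G_FORM = (
    "<source><g id='n1' ctype='x-html-i'>picabo, big-air</g>, and "
    "<g id='n2' ctype='x-html-i'> yard-sale</g></source>"
)

bpt_root = ET.fromstring(BPT_FORM)
g_root = ET.fromstring(G_FORM)

# In the escaped form, the original tag is only a string inside <bpt>.
bpt_codes = [el.text for el in bpt_root.iter("bpt")]

# In the <g> form, the type and the enclosed text are structured data.
g_texts = [(el.get("ctype"), el.text) for el in g_root.iter("g")]

print(bpt_codes)
print(g_texts)
```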

Thank you for your follow-up,
-----Original Message-----
From: Rodolfo M. Raya [mailto:rodolfo@heartsome.net]
Sent: Monday, March 06, 2006 6:25 PM
To: Schnabel, Bryan S
Cc: ddomeny@ektron.com; xliff@lists.oasis-open.org
Subject: RE: [xliff] RE: How to translate text within G tags?

On Mon, 2006-03-06 at 16:25 -0800, bryan.s.schnabel@exgate.tek.com wrote:


Hi Doug,
I thought about this when I wrote that portion of the HTML profile.
From a philosophical view, I strongly think bpt/ept should only be used in XLIFF files that are derived from non-markup formats (RTF, for example).
I really don't like the idea of using bpt/ept on XLIFF files derived from HTML, XHTML, or XML files.  I see "begin paired tag" and "end paired tag" as an artificial device.  It could easily lead to malformed XML on the conversion from XLIFF back to HTML.
Assuming the source file is well formed, it would be a shame to have to delimit inline elements in an artificial way.  If <g tags are defined in the spec in such a way that they are thought to be for non-translatable text, I would vote to either update the specification, or come up with a new element for identifying translatable inline elements in <target elements.
Thanks to Doug and Rodolfo for bringing this issue to light,

I have my own concerns about <bpt>/<ept> in general and about <g> as used in the HTML profile (although I always considered that <g> was reserved for enclosing moveable non-translatable codes only).

Consider the following HTML paragraph:

<p>Italic texts starts <i>in the middle of first sentence. Italics ends after the second sentence.</i></p>

If <g> is used to enclose italicised text, the corresponding representation would be:

   <source>Italic texts starts <g id='i1' ctype='x-html-i'>in the middle of 
   first sentence. Italics ends after the second sentence.</g></source>

and sentence segmentation is not possible at all.

Retrying with <bpt>/<ept> pairs:

   <source>Italic texts starts <bpt id="1">&lt;i&gt;</bpt>in the middle of 
   first sentence. Italics ends after the second sentence.<ept id="1">&lt;/i&gt;</ept></source>

we still cannot split the text into two segments without separating the <bpt> element from its matching <ept>.
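The pairing problem can be checked mechanically. The sketch below is an illustration only: the helper name `unpaired_ids` is hypothetical, and the two half-segments are what a naive split at the sentence boundary would produce. It reports the ids of `<bpt>` or `<ept>` elements whose partner is missing from the same `<source>`.

```python
import xml.etree.ElementTree as ET

def unpaired_ids(source_xml):
    """Return ids that appear on a <bpt> or an <ept> but not on both."""
    root = ET.fromstring(source_xml)
    opened = {el.get("id") for el in root.iter("bpt")}
    closed = {el.get("id") for el in root.iter("ept")}
    # Symmetric difference: ids missing their matching partner.
    return opened ^ closed

whole = (
    '<source>Italic texts starts <bpt id="1">&lt;i&gt;</bpt>in the middle of '
    'first sentence. Italics ends after the second sentence.'
    '<ept id="1">&lt;/i&gt;</ept></source>'
)

# Hypothetical result of splitting at the sentence boundary.
first_half = (
    '<source>Italic texts starts <bpt id="1">&lt;i&gt;</bpt>in the middle of '
    'first sentence.</source>'
)
second_half = (
    '<source> Italics ends after the second sentence.'
    '<ept id="1">&lt;/i&gt;</ept></source>'
)

print(unpaired_ids(whole))
print(unpaired_ids(first_half))
print(unpaired_ids(second_half))
```

The unsplit segment has no unpaired ids, while each half is left with an orphaned code.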

Two elements come to the rescue: <it> and <ph>

<trans-unit id="1">
   <source>Italic texts starts <it id="1" pos="open">&lt;i&gt;</it>in the middle of 
   first sentence.</source>
</trans-unit>
<trans-unit id="2">
   <source> Italics ends after the second sentence.<it id="1" pos="close">&lt;/i&gt;</it></source>
</trans-unit>

or, using <ph>:

<trans-unit id="1">
   <source>Italic texts starts <ph id="1">&lt;i&gt;</ph>in the middle of 
   first sentence.</source>
</trans-unit>
<trans-unit id="2">
   <source> Italics ends after the second sentence.<ph id="1">&lt;/i&gt;</ph></source>
</trans-unit>

I prefer to use <ph> in my filters. This makes my life a lot easier.
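To show that the split segments still carry everything needed to rebuild the HTML, here is a minimal sketch. It assumes the two `<ph>`-based `<source>` elements from the example, with the line break inside the first one collapsed to a single space; the helper name `flatten` is hypothetical. Because the parser unescapes the content of `<ph>`, concatenating all text in document order restores the inline tags.

```python
import xml.etree.ElementTree as ET

def flatten(source_xml):
    """Concatenate text and unescaped inline codes in document order."""
    root = ET.fromstring(source_xml)
    return "".join(root.itertext())

seg1 = (
    '<source>Italic texts starts <ph id="1">&lt;i&gt;</ph>in the middle of '
    'first sentence.</source>'
)
seg2 = (
    '<source> Italics ends after the second sentence.'
    '<ph id="1">&lt;/i&gt;</ph></source>'
)

# Joining the two segments gives back the paragraph's inner HTML.
rebuilt = flatten(seg1) + flatten(seg2)
print(rebuilt)
```

The same helper would work for the `<it>` variant, since it also keeps the escaped tag as element text.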

