
Subject: Re: [xliff] Opinion: importance of preserving XML markup and nodes

Hi Yves et al.,

I'm sorry I can't respond with more detail, but I'm leaving for Taiwan
tomorrow and need to get ready to go.

The answer about <sub> is two-fold:

(1) Some kinds of sub really do belong to the segment in question. E.g.,

<p>Press the button marked with <img src="widget.png" alt="Widget button" />
to start the widget</p>

(in which widget.png is a picture of something in a UI)

While the alt attribute could be separated into its own segment, I'm not
sure that is wise since it forms a clear part of the segment in some
environments (less clear in others). This concern was probably the most
important of the two.

(2) Some tools treat elements like footnotes as part of the segment in which
they occur. To take a common sort of example, imagine the following sort of
markup (a made-up example because I'm too lazy to get real markup):

Hungarians eat goulash <ref value="All About Hungary, p. 2" /> and fish soup
<ref value="Hungarian diets, p. 54" />.

Some tools would consider this bit to be three segments but others one.
Whatever arguments can be made from a technology standpoint, the *logical*
argument about whether these should be treated as part of the segment or not
for translation isn't entirely clear to me. I would prefer it to be treated
as three segments in some ways, but in other ways that is not a satisfactory
solution since it abstracts these items from their context and creates
segments that in isolation may not be entirely interpretable.

Since there are tools that take both approaches, mandating that we not have
sub is somewhat problematic. Of course, this carries an interoperability
burden, and maybe the argument could be made that we need to put a stake in
the sand and declare that we won't support it. Maybe one option would be a
structure like the following:

Hungarians eat goulash <itag x="1" type="x-reference" sub="1" /> and fish
soup <itag x="2" type="x-reference" sub="2" />.

And then somewhere in the <tu> put something like the following:

<sub id="1">All About Hungary, p. 2</sub>
<sub id="2">Hungarian diets, p. 54</sub>

So the connection is to a separate element with a clear subordination. That
way a tool that treats this element as part of the segment could recreate
that structure while one that does not could separate them upon import.
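
To make the two import behaviours concrete, here is a rough Python sketch of how a tool might read such a structure. It is only an illustration under assumptions: the <seg> wrapper, the layout of the <tu>, the second tag carrying x="2" and sub="2", and the function names are all hypothetical, and the bracket and brace placeholders are arbitrary choices rather than anything a real tool or format prescribes.

```python
import xml.etree.ElementTree as ET

# Hypothetical <tu>: the <seg> wrapper and the attribute values are
# assumptions made for this sketch, not taken from any published schema.
TU = """<tu>
<seg>Hungarians eat goulash <itag x="1" type="x-reference" sub="1" /> and fish soup <itag x="2" type="x-reference" sub="2" />.</seg>
<sub id="1">All About Hungary, p. 2</sub>
<sub id="2">Hungarian diets, p. 54</sub>
</tu>"""


def load(tu_xml):
    """Parse the unit and index the <sub> elements by their id attribute."""
    tu = ET.fromstring(tu_xml)
    subs = {s.get("id"): s.text for s in tu.findall("sub")}
    return tu.find("seg"), subs


def as_one_segment(tu_xml):
    """A tool that keeps subflows in the segment: inline each <sub> text."""
    seg, subs = load(tu_xml)
    parts = [seg.text or ""]
    for itag in seg:
        parts.append("[" + subs[itag.get("sub")] + "]")
        parts.append(itag.tail or "")
    return "".join(parts)


def as_separate_segments(tu_xml):
    """A tool that splits subflows out: main text first, then each <sub>."""
    seg, subs = load(tu_xml)
    parts = [seg.text or ""]
    for itag in seg:
        parts.append("{" + itag.get("x") + "}")  # placeholder for the tag
        parts.append(itag.tail or "")
    return ["".join(parts)] + [subs[itag.get("sub")] for itag in seg]


if __name__ == "__main__":
    print(as_one_segment(TU))
    for segment in as_separate_segments(TU):
        print(segment)
```

The first function yields a single segment with the reference text in place; the second yields three segments, with the main one keeping placeholders so the references can be reattached on export.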

Of course, if you start with a tool that does separate them and generate the
TMX, you would have separate segments, but it should be possible to
programmatically add a process that examines the text being processed by the
tool to see if there are any potential subflows that might have been


> ...and I now realize that the email apparently never made it to the mailing
> list... I guess I'll have to re-write it. But the bottom
> line was: do we need <sub>? I think Arle had some concern about removing it
> because some people at OSCAR were considering it
> important. Could we have some of their technical reasons for keeping it?
> Cheers,
> -yves
