
xliff message


Subject: RE: [xliff] XLIFF inline tags

Hi David and all,


As I understand, the <bx/> and <ex/> elements are used to replace <g> and
</g> when paired codes do not follow XML well-formedness rules (i.e. no
overlapping elements). If the paired codes follow that rule, it is strongly
recommended that the <g> element is used because it simplifies processing.
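The distinction can be checked mechanically. Below is a minimal Python sketch, assuming the inline codes have already been extracted as a list of open/close events (the event format and the function name `is_well_nested` are made up for illustration, not taken from the specification). If the check passes, <g> can be used; if it fails, the codes overlap and <bx/>/<ex/> would be needed.

```python
def is_well_nested(events):
    """Return True if the paired codes nest like well-formed XML.

    `events` is a list of ("open" | "close", code_name) tuples in
    document order. This event format is an assumption of the sketch.
    """
    stack = []
    for kind, name in events:
        if kind == "open":
            stack.append(name)
        elif not stack or stack.pop() != name:
            # A close that does not match the most recent open: overlap.
            return False
    # Anything left on the stack was opened but never closed.
    return not stack


# <b><i>...</i></b> nests properly, so <g> would be usable.
nested = [("open", "b"), ("open", "i"), ("close", "i"), ("close", "b")]
# <b><i>...</b></i> overlaps, so <bx/> and <ex/> would be needed.
overlapping = [("open", "b"), ("open", "i"), ("close", "b"), ("close", "i")]

print(is_well_nested(nested))
print(is_well_nested(overlapping))
```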

[My personal opinion on that is that the original file should be fixed in
those cases, and therefore we shouldn't need <bx/> and <ex/>].


You are quite right about the discrepancies between the <g>/<x> forms and
the <bpt>/<ept>/<ph>/<it> forms. It seems they should have the same
functionality as far as rid, pos, assoc, and clone.

I think it is fine that crc is absent from <g> and <x/>. My understanding of
crc is that it is used to checksum the code content, to verify whether it has
been modified or not. Since <g> and <x/> have no code content (their
code is either in the skeleton or re-created), they don't need crc.
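A minimal Python sketch of that idea, assuming CRC-32 from the standard zlib module and an 8-digit hex encoding. Both choices are assumptions made here for illustration, and the helper names `code_crc` and `is_unmodified` are hypothetical.

```python
import zlib


def code_crc(code_text: str) -> str:
    """Checksum the native code carried inside an element such as <bpt>.

    CRC-32 over the UTF-8 bytes is an assumption of this sketch.
    """
    return format(zlib.crc32(code_text.encode("utf-8")) & 0xFFFFFFFF, "08x")


def is_unmodified(code_text: str, stored_crc: str) -> bool:
    """Compare the current code content against the stored crc value."""
    return code_crc(code_text) == stored_crc


# The producer stores the checksum when the file is created...
stored = code_crc("<b>")
# ...and verifies the code content when the file comes back.
print(is_unmodified("<b>", stored))
print(is_unmodified("<B>", stored))
```

An element with no code content, such as <g> or <x/>, gives such a check nothing to hash.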


For source to target mapping: I would expect that id values cannot be
changed. The elements could be moved, yes (and actually maybe we should have
a 'move' attribute like we have a 'clone'), but not changed.

The more I look at rid, the more I think it has the role the new xid
attribute has (this attribute comes from OpenTag, and there too it
seems clearly to have the role xid has). The only case where the 1.0
specification mentions rid to match elements within a segment is for <bx/>
and <ex/>, and I'm guessing rid was chosen because it was there. The
specification says: "These paired elements are related via their rid
attributes", which seems to hint that an <ex/> and a <bx/> with the same rid
would be linked. I see a number of issues with this:

a) The rid is only optional in those elements. Shouldn't it be mandatory if
it has such an important function?

b) The logical way to use rid (to me) would be to refer to an id, not
another rid.

c) Another illogical aspect is that we match <bpt> and <ept> with id. Why
use rid for <bx/> and <ex/>?
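To make issue (a) concrete, here is a rough Python sketch of rid-based pairing using the standard xml.etree.ElementTree module. The sample segments and the function name `pair_by_rid` are hypothetical, and this is only one reading of the "related via their rid attributes" wording, not a definitive implementation.

```python
import xml.etree.ElementTree as ET


def pair_by_rid(segment_xml):
    """Pair <bx/> with <ex/> when both carry the same rid value.

    Returns (pairs, unpaired): pairs holds (bx id, ex id) tuples, and
    unpaired holds the ids of elements that could not be linked.
    """
    root = ET.fromstring(segment_xml)
    open_by_rid = {}
    pairs, unpaired = [], []
    for el in root.iter():
        rid = el.get("rid")
        if el.tag == "bx":
            if rid is None:
                unpaired.append(el.get("id"))  # optional rid omitted
            else:
                open_by_rid[rid] = el.get("id")
        elif el.tag == "ex":
            if rid in open_by_rid:
                pairs.append((open_by_rid.pop(rid), el.get("id")))
            else:
                unpaired.append(el.get("id"))
    unpaired.extend(open_by_rid.values())
    return pairs, unpaired


# Hypothetical segment with overlapping codes, all carrying rid.
with_rid = ('<source>A <bx id="1" rid="1"/>bold <bx id="2" rid="2"/>both'
            '<ex id="3" rid="1"/> italic<ex id="4" rid="2"/></source>')
# The same kind of segment with the optional rid left out.
without_rid = '<source><bx id="1"/>text<ex id="2"/></source>'

print(pair_by_rid(with_rid))
print(pair_by_rid(without_rid))
```

When rid is omitted, which the schema allows, nothing can be paired and every element ends up in the unpaired list.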

I think a lot of the problems we are having in inline elements come from the
fact that we have not worked on enough outputs using those tags.

That was my 2 cents for the day,

-----Original Message-----
From: David Pooley [mailto:DPooley@sdlintl.com]
Sent: Thu, February 20, 2003 5:20 AM
To: xliff@lists.oasis-open.org
Subject: [xliff] XLIFF inline tags

As a tools developer, I'm a little concerned over the functionality of the
inline tags available in XLIFF. Sleep deprivation (thank you son) means that
I may forget later so I thought that I'd raise this now. I apologise that
this has come so late in the day but it was only the recent discussion on
the <bin-unit> problem that prompted me to re-visit these tags. I accept
that this is too late for V1.1 but perhaps we can discuss this for the next
version ...

First of all, I'm not clear why <bx/> and <ex/> are required. As far as I
can make out, they are functionally equivalent to <bpt></bpt> and
<ept></ept> (i.e. empty elements). John may be able to set me straight on
this. If this is the case then at least I'll know what to do with them in my
parser. However, having extra elements for the sake of it doesn't make my
life any easier.

Secondly, I don't think that I'm given sufficient freedom of choice for my
mark-up style. I believe that <g> and <bpt> are supposed to be functionally
equivalent. However, <g> gives me the ability to "clone" whereas <bpt> gives
me the option of an "rid" and a "crc". A similar case applies to <x/> and
<ph>; I can "clone" an <x/> but <ph> lets me specify a "crc" and an "assoc".
In fact, <x/> also has to do the job of <it> but doesn't have the option of
a "pos" attribute (it would also need an attribute to indicate what role
it's playing; <ph> or <it>).

Thirdly, I'd also be interested to hear views on sub-sentence alignment and
the role of the "id" and "rid" attributes. Take, for example, the following
snippet of HTML and its translation into French:

<p>The <b>black</b> <i>cat</i></p>
<p>Le <i>chat</i> <b>noir</b></p>

When filtered into XLIFF, we have the following possibilities (depending on
mark-up style).

<source>The <g id="1" ctype="bold">black</g> <g id="2"

<source>The <bpt id="1" ctype="bold">&lt;b&gt;</bpt>black<ept
id="1">&lt;/b&gt;</ept> <bpt id="2" ctype="italic">&lt;i&gt;</bpt>cat<ept

Is there then an implicit requirement to match up the formatting in the
target or should we be looking to use the "rid" attribute to do this? SDLX
traditionally matches up the <g> tags by "id" but, in TMX (the origin of
<bpt> et al), there's no requirement to do this and there is available an
"x" attribute to achieve this goal.

My logic would suggest that the following translations are correct:

<target>Le <g id="2" ctype="italic">chat</g> <g id="1"

<target>Le <bpt id="2" ctype="italic">&lt;i&gt;</bpt>chat<ept
id="2">&lt;/i&gt;</ept> <bpt id="1" ctype="bold">&lt;b&gt;</bpt>noir<ept

However, are there any restrictions in XLIFF to prevent me from having the
following as translations? (Note the "id" numbers.)

<target>Le <g id="1" ctype="italic">chat</g> <g id="2"

<target>Le <bpt id="1" ctype="italic">&lt;i&gt;</bpt>chat<ept
id="1">&lt;/i&gt;</ept> <bpt id="2" ctype="bold">&lt;b&gt;</bpt>noir<ept

I thought that perhaps this was what the "rid" attribute was for (to match
up the formatting between source and target). I'd rather avoid having to
deal with the second situation in my parsing, as I then need to work out
exactly which source formatting corresponds to which target formatting. In
the case above, I could possibly use the "ctype" to match them up but that's
not a mandatory attribute so it's not always possible. I also don't relish
the prospect of fetching out the content of <bpt> and trying to use that!
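The matching problem can be sketched in a few lines of Python with the standard xml.etree.ElementTree module. The three segments below are complete <g>-style strings written out for illustration from the HTML example, and the function names are hypothetical. The check leans on "ctype", so it shares the weakness noted above: without that optional attribute it cannot tell the two targets apart.

```python
import xml.etree.ElementTree as ET


def codes_by_id(segment_xml):
    """Map each <g> id to its ctype (None when ctype is absent)."""
    root = ET.fromstring(segment_xml)
    return {g.get("id"): g.get("ctype") for g in root.iter("g")}


def mismatched_ids(source_xml, target_xml):
    """Return the ids whose ctype differs between source and target."""
    src = codes_by_id(source_xml)
    tgt = codes_by_id(target_xml)
    return sorted(i for i in src.keys() | tgt.keys() if src.get(i) != tgt.get(i))


source = ('<source>The <g id="1" ctype="bold">black</g> '
          '<g id="2" ctype="italic">cat</g></source>')
# Codes moved but ids kept: id 2 is still italic, id 1 is still bold.
reordered = ('<target>Le <g id="2" ctype="italic">chat</g> '
             '<g id="1" ctype="bold">noir</g></target>')
# Ids reassigned in target order: id 1 now means italic.
renumbered = ('<target>Le <g id="1" ctype="italic">chat</g> '
              '<g id="2" ctype="bold">noir</g></target>')

print(mismatched_ids(source, reordered))
print(mismatched_ids(source, renumbered))
```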

As I said, I don't expect anything to get resolved for V1.1. However, I'd
like to see some clarification of inline mark-up in the next version (V1.2 /
V2.0?). The currently available options are adequate for my own filtering
(I'll stick to the OpenTag tags that I know and love) but I'd really like to
provide an XLIFF editing environment for files that come from other sources
and I don't feel that the current specification is helping as much as it

David Pooley
Software Architect
SDL International


