
xliff message


Subject: Re: SPAM - RE: [xliff] HTML Inline codes

At Wed, 20 Oct 2004 14:54:41 -0600,
Yves Savourel wrote:
> The question is:
> Which elements should remain inside the extracted text.

This does not answer the question directly, but rather offers a view
of the subject from a slightly different angle.

1. <br> can be inline or not, depending on the context. It can be used
   as a formatting code or as a paragraph separator, so it needs to be
   treated on a case-by-case basis.

2. <b> should normally be inline, but not always. If <b> appears like
   this,

   <p><b>Sample text.</b></p>

   it is better to exclude <b> from the text. One reason is that
   translators generally don't want to be distracted by codes. The
   fewer codes, the happier they are. Another reason is that one can
   expect better TM leverage without codes. If other documents have
   a sentence like <i>Sample text.</i>, a TM result would be 100% if
   both the TM record and the source segment don't
   include codes. With codes, a match result would generally be
   lower than 100%.

3. So inline elements should be excluded from the segments when they
   sit at the outer edges of the segments, right?  Well, it's not
   always possible.  Consider the example below.

   <p><b>Sample text.</b> Another text.</p>

   Whether </b> can be excluded from the segments depends on the
   implementation of the extraction framework. If the extractor
   extracts each segment as a unit and puts anything between segments
   in the skeleton, </b> can be excluded. If the extractor extracts a
   paragraph as a unit, then </b> falls in the middle of the unit and
   cannot be excluded.
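
To make points 2 and 3 concrete, here is a rough Python sketch. It is
not how any particular tool works. The names peel, split_sentences and
extract, the INLINE set, and the naive sentence splitter are all
assumptions made up for this illustration. The input is the content of
a <p> element; tags with attributes are not handled.

```python
import re

# Assumed set of inline elements; a real profile would define this.
INLINE = ("b", "i", "u", "em", "strong")

def peel(unit):
    """Peel inline tags that wrap the whole unit.

    Returns (codes_before, text, codes_after). The peeled codes would
    go to the skeleton instead of the translatable text.
    """
    before, after = "", ""
    while True:
        m = re.fullmatch(r"<(\w+)>(.*)</\1>", unit, re.S)
        if not m or m.group(1) not in INLINE:
            break
        tag, inner = m.group(1), m.group(2)
        if f"<{tag}>" in inner or f"</{tag}>" in inner:
            break  # first and last tags are not a matching pair
        before += f"<{tag}>"
        after = f"</{tag}>" + after
        unit = inner
    return before, unit, after

def split_sentences(content):
    """Naive splitter: a sentence ends at . ! or ? plus any closing tags."""
    pattern = r".+?[.!?](?:</\w+>)*(?=\s|$)"
    return [s.strip() for s in re.findall(pattern, content)]

def extract(content, unit="sentence"):
    """Return the translatable text of each extraction unit."""
    units = split_sentences(content) if unit == "sentence" else [content]
    return [peel(u)[1] for u in units]

para = "<b>Sample text.</b> Another text."

# Segment as the unit: <b> and </b> wrap the first segment and drop out.
print(extract(para, "sentence"))   # ['Sample text.', 'Another text.']

# Paragraph as the unit: </b> is in the middle, so the codes stay.
print(extract(para, "paragraph"))  # ['<b>Sample text.</b> Another text.']
```

With the wrapping codes peeled, <b>Sample text.</b> and
<i>Sample text.</i> both yield the text "Sample text.", which is why
the TM lookup can return an exact match.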

These are implementation issues we came across. Other tool vendors may
have different issues, too. It would be great if these issues were
captured in some way in the profile.

Shigemichi Yazawa
