OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.


Help: OASIS Mailing Lists Help | MarkMail Help

xliff message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]

Subject: RE: [xliff] Segmentation as core or not

Hi Helena,


I guess theoretically it would be possible to have an entire chapter in one “part”. But the extraction tools would not likely do that. Even when there is no sentence-based segmentation the extractors do break down the content into much smaller parts; typically the equivalent of paragraphs for document-type files, or strings for UI-type file.


Actually quite a few tools, especially for software, don’t go beyond that type of segmentation. If you look at many tools for PO files, or Java properties files for examples: Their entries are not often sentence-segmented. And they create TMX files where the entries are called “segments”.


Others may correct me, but I think calling those extracted parts “segments” is simply a relatively common practice.


Personally I think the important thing is to be very clear on what those “part” are, regardless how we end up calling the elements. That said we should obviously pick a name that is not too confusing.

It seems “segment” has been used for a while to mean both the container of something un-segmented and segmented (see for example TMX’s <seg>), but maybe I’ve been too deep in TMX/XLIFF/etc. for too long to see the world with un-tainted eyes :)


Hope this helps,




From: Helena S Chapman [mailto:hchapman@us.ibm.com]
Sent: Tuesday, November 01, 2011 7:52 PM
To: Yves Savourel
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] Segmentation as core or not


Yves, I want to make sure I understand your view point. Based on what you suggested, it is possible for one to have an entire chapter or book as a single *part* when pass it around in an XLIFF file? If so, why call it a segment?

<unit id='1'>
 <source>Sentence one. Sentence two. Sentence three. .... Sentence two thousand and forty five.</source>

Best regards,

Helena Shih Chapman
Globalization Technologies and Architecture
+1-720-396-6323 or T/L 938-6323
Waltham, Massachusetts

From:        Yves Savourel <ysavourel@enlaso.com>
To:        <xliff@lists.oasis-open.org>
Date:        11/01/2011 04:56 PM
Subject:        [xliff] Segmentation as core or not
Sent by:        <xliff@lists.oasis-open.org>

Hi all,

To continue on the discussion whether the "segmentation" feature is core or not:

I think Dave has an obviously valid point when saying that segmentation is not necessarily done at the time of the extraction, and therefore we could have un-segmented XLIFF.

But to me a "segment" is not necessarily the result of a segmentation process it can be a "block" extracted from the original format (as our definition states:
So each un-segmented entry is, by nature a segment, that simply contains potentially several sentences.

Maybe things would more clear if we think about the element <segment> as a "part" rather than a "segment"? The Segmentation representation addresses how to organize and manipulate such parts.

<unit id='1'>
 <source>Sentence one. Sentence two.</source>

<unit id='1'>
 <source>Sentence one. </source>
 <source> Sentence two.</source>

Maybe, viewed from that angle it's more clear that such element needs to be part of the core?


To unsubscribe, e-mail: xliff-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: xliff-help@lists.oasis-open.org

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]