
Subject: RE: [xliff] Segmentation as core or not


It almost reads as though what the localization industry is used to calling a "segment" is really a "partition": basically something that has been cut and classified, but that could be further divided or broken into finer fragments? Since I have only been involved in localization topics for the last 3-4 years, I am probably close to having un-tainted eyes.

To me, a segment in the localization world is something that usually has something to do with payment. That is, even if one is paying for a service by the word, the cost of each word can still be determined by the complexity of a segment (e.g. its length).




From:        Yves Savourel <ysavourel@enlaso.com>
To:        Helena S Chapman/San Jose/IBM@IBMUS
Cc:        <xliff@lists.oasis-open.org>
Date:        11/01/2011 11:02 PM
Subject:        RE: [xliff] Segmentation as core or not




Hi Helena,
 
I guess theoretically it would be possible to have an entire chapter in one “part”. But the extraction tools would not likely do that. Even when there is no sentence-based segmentation, the extractors do break down the content into much smaller parts: typically the equivalent of paragraphs for document-type files, or strings for UI-type files.
 
Actually quite a few tools, especially for software, don’t go beyond that type of segmentation. If you look at many tools for PO files or Java properties files, for example, their entries are not often sentence-segmented. And they create TMX files where the entries are called “segments”.
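
As a rough illustration of that kind of string-level extraction, here is a minimal Python sketch. It assumes a heavily simplified key=value reading of a properties file (no escapes, no line continuations) and reuses the <unit>/<part>/<source> names from the examples in this thread. The function name extract_units and the sample strings are hypothetical, not taken from any real tool.

```python
import xml.etree.ElementTree as ET

def extract_units(properties_text):
    """Turn each key=value entry into a <unit> holding one un-segmented <part>."""
    units = []
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blanks and comments; real .properties parsing has more rules.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        unit = ET.Element("unit", {"id": key.strip()})
        part = ET.SubElement(unit, "part")
        # The whole string goes into a single part, even if it has two sentences.
        ET.SubElement(part, "source").text = value.strip()
        units.append(unit)
    return units

sample = """# UI strings
save.prompt=Save changes? Unsaved work will be lost.
exit.label=Exit
"""
units = extract_units(sample)
for unit in units:
    print(ET.tostring(unit, encoding="unicode"))
```

The first entry contains two sentences but still ends up as one part, which is the situation described above: the extracted string is the "segment", whether or not it was sentence-segmented.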
 
Others may correct me, but I think calling those extracted parts “segments” is simply a relatively common practice.
 
Personally I think the important thing is to be very clear on what those “parts” are, regardless of what we end up calling the elements. That said, we should obviously pick a name that is not too confusing.
It seems “segment” has been used for a while to mean both the container of something un-segmented and segmented (see for example TMX’s <seg>), but maybe I’ve been too deep in TMX/XLIFF/etc. for too long to see the world with un-tainted eyes :)
 
Hope this helps,
-yves
 
 
From: Helena S Chapman [mailto:hchapman@us.ibm.com]
Sent: Tuesday, November 01, 2011 7:52 PM
To: Yves Savourel
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] Segmentation as core or not

 
Yves, I want to make sure I understand your viewpoint. Based on what you suggested, it is possible for one to have an entire chapter or book as a single *part* when passing it around in an XLIFF file? If so, why call it a segment?

<unit id='1'>
<part>
<source>Sentence one. Sentence two. Sentence three. .... Sentence two thousand and forty five.</source>
</part>
</unit>


Best regards,

Helena Shih Chapman
Globalization Technologies and Architecture
+1-720-396-6323 or T/L 938-6323
Waltham, Massachusetts





From:        Yves Savourel <ysavourel@enlaso.com>
To:        <xliff@lists.oasis-open.org>
Date:        11/01/2011 04:56 PM
Subject:        [xliff] Segmentation as core or not
Sent by:        <xliff@lists.oasis-open.org>






Hi all,

To continue the discussion on whether the "segmentation" feature is core or not:

I think Dave has an obviously valid point when saying that segmentation is not necessarily done at the time of the extraction, and therefore we could have un-segmented XLIFF.

But to me a "segment" is not necessarily the result of a segmentation process; it can be a "block" extracted from the original format, as our definition states:
http://wiki.oasis-open.org/xliff/OneContentModel#Definitions.2BAC8-Terminology
So each un-segmented entry is, by nature, a segment that simply contains potentially several sentences.

Maybe things would be clearer if we think about the element <segment> as a "part" rather than a "segment"? The Segmentation representation addresses how to organize and manipulate such parts.

<unit id='1'>
<part>
<source>Sentence one. Sentence two.</source>
</part>
</unit>

<unit id='1'>
<part>
<source>Sentence one. </source>
</part>
<part>
<source> Sentence two.</source>
</part>
</unit>
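
To make the step from the first form to the second concrete, here is a minimal Python sketch (Python 3.7+ for splitting on an empty match). It assumes plain-text <source> content with no inline markup and a deliberately naive sentence rule (a boundary after ".", "!" or "?" when whitespace follows). The function name segment_unit and the regular expression are illustrative assumptions, not part of any XLIFF draft.

```python
import re
import xml.etree.ElementTree as ET

# Naive boundary: a zero-width position after ., ! or ? that is followed by whitespace.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])(?=\s)")

def segment_unit(unit):
    """Return a copy of <unit> in which each <part> holds one sentence."""
    result = ET.Element(unit.tag, dict(unit.attrib))
    for part in unit.findall("part"):
        text = part.findtext("source", default="")
        for piece in SENTENCE_BOUNDARY.split(text):
            if piece:  # skip empty strings produced by empty sources
                new_part = ET.SubElement(result, "part")
                ET.SubElement(new_part, "source").text = piece
    return result

unit = ET.fromstring(
    "<unit id='1'><part><source>Sentence one. Sentence two.</source></part></unit>"
)
segmented = segment_unit(unit)
sources = [s.text for s in segmented.iter("source")]
print(sources)
```

Whitespace is kept inside the parts, so concatenating the <source> texts gives back the original un-segmented string. The unit is the same in both forms; only the number of parts changes.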

Maybe, viewed from that angle, it's clearer that such an element needs to be part of the core?

Cheers,
-ys





