Subject: RE: [xliff] Segmentation as core or not


Hi all,

I think we might be putting the cart before the horse. I think David W. and Christian (among others) have an action item to come up with criteria for determining if a proposed or accepted feature is core vs. extended module. Perhaps we should wait until we have a more mature discussion on what criteria we should use, before we try to determine if this feature is core or not. But by all means, continue the technical discussion on this feature. Just thinking out loud here.

- Bryan
________________________________________
From: xliff@lists.oasis-open.org [xliff@lists.oasis-open.org] On Behalf Of Yves Savourel [ysavourel@enlaso.com]
Sent: Tuesday, November 01, 2011 8:01 PM
To: 'Helena S Chapman'
Cc: xliff@lists.oasis-open.org
Subject: RE: [xliff] Segmentation as core or not

Hi Helena,

I guess theoretically it would be possible to have an entire chapter in one “part”. But the extraction tools would not likely do that. Even when there is no sentence-based segmentation, the extractors do break down the content into much smaller parts: typically the equivalent of paragraphs for document-type files, or strings for UI-type files.

Actually quite a few tools, especially for software, don’t go beyond that type of segmentation. If you look at many tools for PO files or Java properties files, for example, their entries are often not sentence-segmented. And they create TMX files where the entries are called “segments”.

Others may correct me, but I think calling those extracted parts “segments” is simply a relatively common practice.

Personally I think the important thing is to be very clear on what those “parts” are, regardless of what we end up calling the elements. That said, we should obviously pick a name that is not too confusing.
It seems “segment” has been used for a while to mean both the container of something un-segmented and segmented (see for example TMX’s <seg>), but maybe I’ve been too deep in TMX/XLIFF/etc. for too long to see the world with un-tainted eyes :)

Hope this helps,
-yves


From: Helena S Chapman [mailto:hchapman@us.ibm.com]
Sent: Tuesday, November 01, 2011 7:52 PM
To: Yves Savourel
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] Segmentation as core or not

Yves, I want to make sure I understand your viewpoint. Based on what you suggested, is it possible for one to have an entire chapter or book as a single *part* when passing it around in an XLIFF file? If so, why call it a segment?

<unit id='1'>
<part>
 <source>Sentence one. Sentence two. Sentence three. .... Sentence two thousand and forty five.</source>
</part>
</unit>

Best regards,

Helena Shih Chapman
Globalization Technologies and Architecture
+1-720-396-6323 or T/L 938-6323
Waltham, Massachusetts




From:        Yves Savourel <ysavourel@enlaso.com>
To:        <xliff@lists.oasis-open.org>
Date:        11/01/2011 04:56 PM
Subject:        [xliff] Segmentation as core or not
Sent by:        <xliff@lists.oasis-open.org>
________________________________



Hi all,

To continue on the discussion whether the "segmentation" feature is core or not:

I think Dave has an obviously valid point when saying that segmentation is not necessarily done at the time of the extraction, and therefore we could have un-segmented XLIFF.

But to me a "segment" is not necessarily the result of a segmentation process; it can be a "block" extracted from the original format (as our definition states: http://wiki.oasis-open.org/xliff/OneContentModel#Definitions.2BAC8-Terminology).
So each un-segmented entry is, by nature, a segment that simply may contain several sentences.

Maybe things would be clearer if we think about the element <segment> as a "part" rather than a "segment"? The Segmentation representation addresses how to organize and manipulate such parts.

<unit id='1'>
<part>
 <source>Sentence one. Sentence two.</source>
</part>
</unit>

<unit id='1'>
<part>
 <source>Sentence one. </source>
</part>
<part>
 <source> Sentence two.</source>
</part>
</unit>

Maybe, viewed from that angle, it's clearer that such an element needs to be part of the core?
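
To make the two states above concrete, here is a rough Python sketch of a later processing step that turns the un-segmented unit into the segmented one. The element names <unit>, <part> and <source> come from the examples above; the function name split_unit and the sentence-break rule (a '.', '!' or '?' followed by whitespace) are assumptions made only for illustration and are not part of any XLIFF draft. A real tool would more likely apply proper segmentation rules.

```python
import re
import xml.etree.ElementTree as ET

# Naive sentence pattern (assumption): a run of text ending at '.', '!' or '?'
# followed by whitespace, or at the end of the string. Whitespace between
# sentences stays at the start of the next piece, so nothing is lost.
SENTENCE = re.compile(r".+?(?:[.!?](?=\s)|\Z)", re.S)


def split_unit(unit):
    """Return a new <unit> in which each <part> is split into sentence-sized <part>s.

    Hypothetical helper for illustration; the input unit is left unchanged.
    """
    new_unit = ET.Element("unit", dict(unit.attrib))
    for part in unit.findall("part"):
        text = part.findtext("source") or ""
        for piece in SENTENCE.findall(text):
            new_part = ET.SubElement(new_unit, "part")
            ET.SubElement(new_part, "source").text = piece
    return new_unit


if __name__ == "__main__":
    unsegmented = ET.fromstring(
        "<unit id='1'><part><source>Sentence one. Sentence two.</source></part></unit>"
    )
    segmented = split_unit(unsegmented)
    print(ET.tostring(segmented, encoding="unicode"))
```

Both the input and the output are valid under the same structure: a unit holding one or more parts. The only difference is how many parts there are, which is the point made above about the container being needed whether or not segmentation has been applied.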

Cheers,
-ys


