Hi Tony,
I just thought of one more issue with this approach, related
to Yves’ comment in the meeting today about treating
segmentation as a process separate from filtering.
As a result of filtering (or other
processing of the XLIFF file before it reaches the segmentation tool) the
original <trans-unit> before segmentation may have any of the following
associated information:
- Attributes: approved, translate,
reformat, xml:space, datatype, ts, phase-name, restype, resname, extradata,
help-id, menu, menu-option, menu-name, coord, font, css-style, style, exstyle,
extype, maxbytes, minbytes, size-unit, maxheight, minheight, maxwidth,
minwidth, charclass.
- Elements of type <context-group>,
<count-group>, <prop-group>, <note>, <alt-trans>
(appearing after the <source> and/or <target>).
- Non-XLIFF elements.
If we need to segment this
<trans-unit> (and thus replace it with a <group> of new <trans-unit>
elements) we need to decide how to handle each of these associated pieces of
data. The following options come to my mind:
- Transfer/copy to each <trans-unit>.
- Transfer to the first (or last) <trans-unit> only.
- Associate information with the <group>.
- Various combinations of these.
For many of these cases the only option
that seems correct/safe is to associate the information with the <group>,
which means that we would need to extend/modify the current definition of <group>
to support almost everything that a <trans-unit> supports.
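As a rough illustration of the third option, here is a minimal Python sketch using the standard xml.etree.ElementTree module. It is only a sketch under stated assumptions, not a proposed implementation: the function name `segment_into_group`, the "1.1"/"1.2" id scheme, and the `(source, target)` input pairs are hypothetical, namespaces are ignored, and the segments are assumed to be plain text without inline tags. It also produces a <group> carrying attributes and child elements that the current <group> definition may not allow, which is exactly the problem described above.

```python
import xml.etree.ElementTree as ET


def segment_into_group(trans_unit, segments):
    """Replace one <trans-unit> with a <group> of per-segment <trans-unit>s.

    `segments` is a list of (source_text, target_text) pairs from some
    segmentation tool (hypothetical input; computing them is out of scope).
    All associated information is attached to the <group> (option 3).
    """
    unit_id = trans_unit.get("id")
    group = ET.Element("group")

    # Associate every original attribute (including id) with the group.
    for name, value in trans_unit.attrib.items():
        group.set(name, value)
    # Assumption: mark the group as in the segment-group proposal. Note that
    # an original extype value would be overwritten here.
    group.set("extype", "segment-group")

    # Move <note>, <context-group>, <alt-trans>, non-XLIFF elements, etc.
    # to the group; only <source> and <target> are split per segment.
    for child in trans_unit:
        if child.tag not in ("source", "target"):
            group.append(child)

    source = trans_unit.find("source")
    target = trans_unit.find("target")
    for index, (source_text, target_text) in enumerate(segments, start=1):
        segment = ET.SubElement(
            group, "trans-unit", {"id": f"{unit_id}.{index}", "extype": "segment"}
        )
        # Keep xml:lang and other <source>/<target> attributes on each segment.
        new_source = ET.SubElement(segment, "source", dict(source.attrib))
        new_source.text = source_text
        if target is not None:
            new_target = ET.SubElement(segment, "target", dict(target.attrib))
            new_target.text = target_text
    return group
```

Calling `ET.tostring(segment_into_group(unit, pairs), encoding="unicode")` on a parsed <trans-unit> shows the resulting structure. Options 1 and 2 would differ only in where the attribute and child-element loops write their output.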
Cheers,
Magnus
From: Tony Jewtushenko [mailto:Tony.Jewtushenko@oracle.com]
Sent: Tuesday, May 25, 2004 8:14 AM
To: xliff-seg@lists.oasis-open.org
Subject: [xliff-seg] Re: XLIFF Segmentation at trans-unit or group level
One more Con:
7. May not be suitable for resegmenting across inline tags.
Tony Jewtushenko wrote:
Summary:
In a nutshell, the concept behind this scenario is that segments can be
defined by the hard structure of trans-units, or optionally within
groups. All segmentation functionality is offloaded to the tools that generate
or process the XLIFF content, including resegmentation.
Original Discussions:
http://www.oasis-open.org/archives/xliff-seg/200404/msg00024.html
http://www.oasis-open.org/archives/xliff-seg/200404/msg00012.html
Original XLIFF file:
<trans-unit id="1">
  <source xml:lang="en-US">Long sentence. Short sentence.</source>
  <target xml:lang="sv-SE" state="final">Lång mening. Mer mening. Kort mening.</target>
</trans-unit>
A tool can segment this into the following "hard" segments:
<trans-unit id="1.1">
  <source xml:lang="en-US">Long sentence.</source>
  <target xml:lang="sv-SE" state="translated">Lång mening. </target>
</trans-unit>
<trans-unit id="1.2">
  <source xml:lang="en-US">Short sentence.</source>
  <target xml:lang="sv-SE" state="translated">Kort mening.</target>
</trans-unit>
Or, we can create segmentation groups that contain the individual
segments, and tag them with additional metadata that identifies them as
segment-groups/segments:
<group extype="segment-group" id="1">
  <trans-unit extype="segment" id="1">
    <source xml:lang="en-US">Long sentence.</source>
    <target xml:lang="sv-SE" state="translated">Lång mening.</target>
  </trans-unit>
  <trans-unit extype="segment" id="2">
    <source xml:lang="en-US">Short sentence.</source>
    <target xml:lang="sv-SE" state="translated">Kort mening.</target>
  </trans-unit>
</group>
When reconciling with the skeleton file, presumably when building out the
translated file, a tool would need to match up the segment-group with the
original segment it replaces. Similar reconciliation would be required if
updating the TM.
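One way a tool might do this matching is to collapse the segment-group back into a single <trans-unit> before merging with the skeleton. The Python sketch below is only an illustration under stated assumptions: the function name `merge_segment_group` is hypothetical, segments are assumed to be plain text with no inline tags, the single-space separator is a guess (the original whitespace between segments is not recorded anywhere in the group), and <target> attributes such as state are simply taken from the first segment.

```python
import xml.etree.ElementTree as ET


def merge_segment_group(group, separator=" "):
    """Rebuild a single <trans-unit> from a <group extype="segment-group">."""
    if group.get("extype") != "segment-group":
        raise ValueError("not a segment-group")

    # Only direct children marked as segments take part in the merge.
    segments = [
        unit for unit in group.findall("trans-unit")
        if unit.get("extype") == "segment"
    ]

    # The group id is assumed to be the id of the trans-unit it replaced.
    merged = ET.Element("trans-unit", {"id": group.get("id")})
    for tag in ("source", "target"):
        parts = [unit.find(tag) for unit in segments]
        parts = [part for part in parts if part is not None]
        if not parts:
            continue
        # Assumption: attributes (xml:lang, state) come from the first segment.
        element = ET.SubElement(merged, tag, dict(parts[0].attrib))
        element.text = separator.join((part.text or "").strip() for part in parts)
    return merged
```

Each of the assumptions above is a decision an individual tool would have to make on its own, since nothing in the group structure prescribes it.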
PRO's:
- Uses existing XLIFF structures: no additional changes to the XLIFF
  specification are required.
- For most use cases, very simple to implement.
CON's:
1. Matching up resegmented data becomes tool-dependent, and may be handled
   differently by individual tool providers.
2. May be unsuitable, or very complicated, for moving segments across groups
   (i.e., resegmenting the segment-groups).
3. May not be suitable for processing target-only changes returned by the
   vendor.
4. Implementation is non-normative and would require adherence to a profile
   rather than XSD validation.
5. Complicates the process of building out translated content using skeleton
   files.
6. Multiple iterations of resegmentation may create a tangle of groups.
--
Tony Jewtushenko
Principal Product Manager - Oracle Application Development Tools
Oracle Corporation, Ireland
mailto:tony.jewtushenko@oracle.com
Direct tel: +353.1.8039080