OASIS Mailing List Archives

 



xliff message



Subject: Re: [xliff] XLIFF 1.0 issues - <note> as a child of <count>



Hi John,

I've tried to collate the discussion thread I started on the <count> element below. However, I'm going to omit the side-discussion that was beginning around standardising the 'tool-name' attribute (I'll let someone else continue that should they wish), and Yves' comments regarding the actual word count. I've added my latest thoughts at the end.

The original point:

1. <note> as a child of <count>
Currently the <count> element is very ambiguous. A note as a child
element could be used to indicate what was being counted, what was considered
a word, etc.
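
For illustration only, the suggestion might look something like the fragment below. This is a sketch, not valid XLIFF 1.0: <note> is not currently allowed inside <count>, and the note text is invented for the example. The 'unit' value is one of the recommended values mentioned later in this thread.

```xml
<count-group name="example">
<!-- Hypothetical: a human-readable note describing how the count was made -->
<count count-type="untranslated" unit="word">132<note>Words are
whitespace-delimited tokens; inline codes are not counted.</note></count>
</count-group>
```

Note that this gives <count> mixed content (a number plus a child element), so a reading tool would have to separate the count value from the note text.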




John's first response:

<jr>The <count-group> and <count> elements can be very problematic. A
<note> element within the <count> element may help in the customized
support required by these elements but that is a human readable
approach and probably would need to be defined even more to be truly useful. A
stronger definition of the count element may do more for us.
<count> has the 'unit' attribute which has recommended values of word,
page, trans-unit, bin-unit, and item. The latter three are defined
according to elements within the spec but the former two must be
defined by the tool creating the count. I suggest that we include the tool as
an attribute to the count-group. This would be the same attribute used in
<file>, <phase>, and <alt-trans>. Further refinement of the 'unit'
attribute may also be necessary.</jr>




Then, from Stephen Holmes:

On point 1, I'd just make the comment that the value of adding the
tool that created the wordcount as an attribute is of relatively little use
if you take a situation where, for example, "Tool X" generates the
data, but "Tool Y" reads it for processing and has different ideas about
what constitutes a word count.

It's an age old problem in localisation - "Who has the correct word
count?".  As tools may be completely proprietary, even if based on
XLIFF containers, I see no reason in complicating the attribute qualifiers.
This may become the topic of a subcommittee...





John Reid's further proposals:

On Point 1:
[Alt-jr1] The purpose for including the tool as an attribute of the
<count-group> is so that Tool Y will know that the counts it is about to
use/update are not theirs. Thus, Tool Y may want to produce its own.
However, if Tool Y is compatible with Tool X, then it can use Tool X's
counts. Meanwhile, Tool X can find its own counts and update,
accordingly.

<count-group tool="Tool X" name="example">
<count count-type="untranslated">132</count>
</count-group>

This does become complicated, though, when Tool Z is used. Tool Z may
have a compatibility issue with Tool Y but not Tool X. If Tool Y updated
Tool X's counts, Tool Z may use those inaccurately.

[Alt-jr2] There is another solution to this: We already have a <phase>
element that stores the tool used in that phase. The phase-name
attribute could be added to <count>. Thus, when that count was produced
and by what, could be ascertained by any subsequent tool and a
determination of whether to use the count could be made.

<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X"
date="2002-04-10T09:41:02Z"/>
</phase-group>
..
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
</count-group>

Then again, with either method, a tool has to update the attribute of a
count or count-group element and historical data is lost. Thus, adding
the phase-name attribute or the tool attribute has essentially the same
consequence: Tool Z knows which tool last touched the named count. Using
a tool attribute has the advantage of being in the scope of the current
node. The phase-name has the advantage of carrying additional
information such as date.

[Alt-jr3] Another alternative is to add a new element to <count-group>,
such as <update>, that has attributes of tool and date. Thus, multiple
updates could be recorded for a <count-group>. This would need to be at
the <count-group> level since we do want to keep the contents of <count>
to the actual count.

<count-group name="example">
<update tool="Tool X" date="2002-04-10T09:41:02Z"/>
<count phase-name="create" count-type="untranslated">132</count>
</count-group>

This solution has the disadvantage that it implies an update to all the
counts within a <count-group> of which there may be many and only one
updated. This is also a weakness for adding the tool attribute to
<count-group>.

Alt-jr2 can be used to keep historical data since there is no
restriction on the number of counts that can be stored within the
<count-group>. Thus, Tool X can supply a count in phase 1,

<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X"
date="2002-04-10T09:41:02Z"/>
</phase-group>
..
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
</count-group>

Tool Y can add an update to it in phase 2,

<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X"
date="2002-04-10T09:41:02Z"/>
<phase phase-name="translate" process-name="Translation" tool="Tool Y"
date="2002-04-11T10:42:03Z"/>
</phase-group>
..
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
<count phase-name="translate" count-type="untranslated">43</count>
</count-group>

and Tool Z can update Tool X's count and ignore Tool Y's in phase 3.

<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X"
date="2002-04-10T09:41:02Z"/>
<phase phase-name="translate" process-name="Translation" tool="Tool Y"
date="2002-04-11T10:42:03Z"/>
<phase phase-name="review" process-name="Translation" tool="Tool Z"
date="2002-04-12T11:43:04Z"/>
</phase-group>
..
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
<count phase-name="translate" count-type="untranslated">43</count>
<count phase-name="review" count-type="untranslated">56</count>
</count-group>


Finally, my thoughts, extending from John's ideas:

I don't think that writing the tool used for a count as an attribute is suitable as this may prove inaccurate if/when the same tool re-addresses the XLIFF document after processing by others. Also, this could lead to a rapidly growing <count-group> element or the need for multiple <count-group>s as each tool processing the document writes its own information which then becomes redundant.

However, I do like John's '[Alt-jr2]' proposal, which I consider to be the most usable, but I would like to modify it slightly (and with it the structure of XLIFF) so that the <count-group> element is moved to become a child element of <phase>. I see this as having a number of benefits:
1. It avoids having to parse multiple <count> elements to detect which one is current, i.e. which one represents the document in its current state.
2. It preserves historic count data in a suitable area of the document, i.e. as a child of the phase.
3. It removes the possibility that the count information for a specific tool is incorrect after processing by another tool.
4. It prevents one tool overwriting or altering the count information of another.
5. It negates the requirement to programmatically link the phase information to the count information.
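
A rough sketch of how this restructuring might look, reusing the values from John's phase 2 example above. This is a hypothetical structure, not valid XLIFF 1.0, since <phase> cannot currently contain <count-group>; it also assumes the 'phase-name' attribute on <count> is no longer needed once the count is nested inside its phase.

```xml
<phase-group>
<!-- Hypothetical: each phase carries the counts produced in that phase -->
<phase phase-name="create" process-name="Translation" tool="Tool X"
date="2002-04-10T09:41:02Z">
<count-group name="example">
<count count-type="untranslated">132</count>
</count-group>
</phase>
<phase phase-name="translate" process-name="Translation" tool="Tool Y"
date="2002-04-11T10:42:03Z">
<count-group name="example">
<count count-type="untranslated">43</count>
</count-group>
</phase>
</phase-group>
```

Under this layout the tool and date for each count come from the enclosing <phase>, and the count for the most recent phase is simply the one in the last <phase> element.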

Regards,
Mark



