Subject: RE: [xliff] XLIFF 1.0 issues
Hi,

In the current specification, the tool attribute is free text; the 1.0 spec says that it "is used to specify the signature and version of the tool that created or modified the document". However, this mechanism is a bit loose and open to misuse. For example, a tool may omit the version number. Including tool-name and tool-version attributes in the next version would be a better solution.

Regarding a tools registry, I don't think we could limit the names to a standard list. The hope is that as many tools as possible will use this xliff format.

Is it necessary to have a naming convention for tool names? A convention is too easy to ignore; I think the best solution may be to introduce another attribute, tool-company. This way a tool can be clearly defined as

    tool-company = ACME
    tool-name = Killer App
    tool-version = 4.0

and not in a confusing manner such as

    tool = ACME Killer
or
    tool = ACME Ltd. Killer 4
or
    tool = ACME, Killer App
or
    tool = ACME Ltd., Killer App 4.0

I will write this up in more detail and propose these additions to the TC for the next release of xliff.

Enda

-----Original Message-----
From: Stephen Holmes [mailto:sholmes@novell.com]
Sent: 14 April 2002 11:54
To: xliff@lists.oasis-open.org; John Reid
Subject: Re: [xliff] XLIFF 1.0 issues

Thanks for the information.

Can I ask then, does the tool information capture the version of the tool as well - is it just a free-text attribute?

Reason: Producing the count in the first instance is the responsibility of the content parser. As the parser may be revised X times to address defects, add functionality etc., can we look at a standardised way of specifying the tool/parser and version?

Is this XLIFF group, or some subcommittee, looking at a tools registry, i.e., agreed and standard names for the tools out there, or some form of guidelines for creating these tool names?

Finally, is there a plan to integrate/leverage the LISA findings on this topic?

Steve.
S t e p h e n H o l m e s
Localisation Development Manager
International Product Development
Voice: +353 (1) 241 5732
Fax: +353 (1) 241 5749
Novell, Inc., THE leading provider of Net business solutions
http://www.novell.com

>>> John Reid <JREID@novell.com> 04/12/02 19:05 PM >>>

On Point 1:

[Alt-jr1] The purpose for including the tool as an attribute of the <count-group> is so that Tool Y will know that the counts it is about to use/update are not its own. Thus, Tool Y may want to produce its own. However, if Tool Y is compatible with Tool X, then it can use Tool X's counts. Meanwhile, Tool X can find its own counts and update them accordingly.

    <count-group tool="Tool X" name="example">
      <count count-type="untranslated">132</count>
    </count-group>

This does become complicated, though, when Tool Z is used. Tool Z may have a compatibility issue with Tool Y but not Tool X. If Tool Y updated Tool X's counts, Tool Z may use those inaccurately.

[Alt-jr2] There is another solution to this: We already have a <phase> element that stores the tool used in that phase. The phase-name attribute could be added to <count>. Thus, any subsequent tool could ascertain when that count was produced and by what, and determine whether to use the count.

    <phase-group>
      <phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
    </phase-group>
    ..
    <count-group name="example">
      <count phase-name="create" count-type="untranslated">132</count>
    </count-group>

Then again, with either method, a tool has to update the attribute of a count or count-group element and historical data is lost. Thus, adding the phase-name or the tool attribute has essentially the same consequence: Tool Z knows which tool last touched the named count. Using a tool attribute has the advantage of being in the scope of the current node. The phase-name has the advantage of carrying additional information such as date.
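As a rough illustration of how a consuming tool might apply Alt-jr2, the sketch below resolves each <count> to the tool and date of the phase it names. It is only a sketch under assumptions: the <xliff-fragment> wrapper element, the function names, and the idea of a "trusted tools" set are all hypothetical and are not part of the 1.0 spec or of the proposal itself.

```python
import xml.etree.ElementTree as ET

# Fragments taken from the Alt-jr2 example. The <xliff-fragment> wrapper is
# hypothetical; it only exists so the snippet parses as a single document.
SAMPLE = """
<xliff-fragment>
  <phase-group>
    <phase phase-name="create" process-name="Translation"
           tool="Tool X" date="2002-04-10T09:41:02Z"/>
  </phase-group>
  <count-group name="example">
    <count phase-name="create" count-type="untranslated">132</count>
  </count-group>
</xliff-fragment>
"""


def counts_with_provenance(xml_text):
    """Pair every <count> with the tool and date of the phase it refers to."""
    root = ET.fromstring(xml_text)
    # Index the phases by name so each count can be looked up directly.
    phases = {p.get("phase-name"): p for p in root.iter("phase")}
    results = []
    for group in root.iter("count-group"):
        for count in group.iter("count"):
            phase = phases.get(count.get("phase-name"))
            results.append({
                "group": group.get("name"),
                "count-type": count.get("count-type"),
                "value": int(count.text),
                # A count with no matching phase has unknown provenance.
                "tool": phase.get("tool") if phase is not None else None,
                "date": phase.get("date") if phase is not None else None,
            })
    return results


def usable_counts(xml_text, trusted_tools):
    """Keep only counts produced by tools the caller considers compatible."""
    return [c for c in counts_with_provenance(xml_text)
            if c["tool"] in trusted_tools]
```

A tool that considers itself compatible with Tool X could call usable_counts(SAMPLE, {"Tool X"}) and reuse the result; a tool that trusts nobody else would get an empty list and produce its own counts.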
[Alt-jr3] Another alternative is to add a new element to <count-group>, such as <update>, that has attributes of tool and date. Thus, multiple updates could be recorded for a <count-group>. This would need to be at the <count-group> level since we do want to keep the contents of <count> to the actual count.

    <count-group name="example">
      <update tool="Tool X" date="2002-04-10T09:41:02Z"/>
      <count phase-name="create" count-type="untranslated">132</count>
    </count-group>

This solution has the disadvantage that it implies an update to all the counts within a <count-group>, of which there may be many with only one updated. This is also a weakness of adding the tool attribute to <count-group>.

Alt-jr2 can be used to keep historical data since there is no restriction on the number of counts that can be stored within the <count-group>. Thus, Tool X can supply a count in phase 1,

    <phase-group>
      <phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
    </phase-group>
    ..
    <count-group name="example">
      <count phase-name="create" count-type="untranslated">132</count>
    </count-group>

Tool Y can add an update to it in phase 2,

    <phase-group>
      <phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
      <phase phase-name="translate" process-name="Translation" tool="Tool Y" date="2002-04-11T11:43:04Z"/>
    </phase-group>
    ..
    <count-group name="example">
      <count phase-name="create" count-type="untranslated">132</count>
      <count phase-name="translate" count-type="untranslated">43</count>
    </count-group>

and Tool Z can update Tool X's count and ignore Tool Y's in phase 3.

    <phase-group>
      <phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
      <phase phase-name="translate" process-name="Translation" tool="Tool Y" date="2002-04-11T10:42:03Z"/>
      <phase phase-name="review" process-name="Translation" tool="Tool Z" date="2002-04-12T11:43:04Z"/>
    </phase-group>
    ..
    <count-group name="example">
      <count phase-name="create" count-type="untranslated">132</count>
      <count phase-name="translate" count-type="untranslated">43</count>
      <count phase-name="review" count-type="untranslated">56</count>
    </count-group>

Thoughts?

cheers,
john

>>> Stephen Holmes <sholmes@novell.com> 4/11/02 4:44:58 PM >>>

On point 1, I'd just make the comment that adding the tool that created the word count as an attribute is of relatively little use if you take a situation where, for example, "Tool X" generates the data, but "Tool Y" reads it for processing and has different ideas about what constitutes a word count. It's an age-old problem in localisation - "Who has the correct word count?". As tools may be completely proprietary, even if based on XLIFF containers, I see no reason to complicate the attribute qualifiers. This may become the topic of a subcommittee...

On point 3 - bear in mind that localisation/language tools that aspire to be network-based will find base64 encoded content to be monumentally large to transfer. Europe, remember, is still predominantly 56K and we all remember the hassle involved in FedEx'ing CDs to China - business reality supersedes specification.

Cheers
Steve.

S t e p h e n H o l m e s
Localisation Development Manager
International Product Development
Voice: +353 (1) 241 5732
Fax: +353 (1) 241 5749
Novell, Inc., THE leading provider of Net business solutions
http://www.novell.com

>>> John Reid <JREID@novell.com> 04/11/02 19:02 PM >>>

Hi All,

My comments follow Mark's, between <jr>...</jr> tags.

>>> Mark Levins <mark_levins@ie.ibm.com> 4/5/02 5:59:53 AM >>>

1. <note> as a child of <count>

Currently the <count> element is very ambiguous; a note as a child element could be used to indicate what was being counted, what was considered a word, etc.

<jr>The <count-group> and <count> elements can be very problematic.
A <note> element within the <count> element may help with the customized support required by these elements, but that is a human-readable approach and would probably need to be defined further to be truly useful. A stronger definition of the count element may do more for us. <count> has the 'unit' attribute, which has recommended values of word, page, trans-unit, bin-unit, and item. The latter three are defined according to elements within the spec but the former two must be defined by the tool creating the count. I suggest that we include the tool as an attribute of the count-group. This would be the same attribute used in <file>, <phase>, and <alt-trans>. Further refinement of the 'unit' attribute may also be necessary.</jr>

2. The <count-group>, <prop-group> and <context-group> elements can be used within a <group> without any other relevant child elements

The 1.0 specification allows a <group> element to contain (for example) a <count-group> without containing anything to count. I think the <group> element should be changed to contain at least one of <group>, <trans-unit> or <bin-unit>.

<jr>Shouldn't this requirement be placed on the <body> also?</jr>

3. Binary elements & <internal-file>

This is kind of a big one. At the moment the specification does not define the form of the content of the <internal-file> element (although there is an optional 'form' attribute). The problem I see with this is that the specification allows users to place binary data directly as content - this binary content may contain the reserved XML characters < > etc., which will cause parsers to choke. The CDATA section approach is also not good enough to provide a solution. My suggestion is that the content of the <internal-file> be restricted to Base64 or at least stated so. Also, the description in the spec for the <internal-file> element reads "The <internal-file> element will contain the data for the skeleton file."
which is technically wrong; it may also contain data for a <bin-source> or <bin-target> element.

<jr>How does CDATA fail this purpose? I wouldn't want to restrict this to just Base64, thus requiring a conversion for both the producer and any subsequent processor that may be able to handle the original format without a problem. Additionally, wouldn't we need an attribute such as 'original-format' if we forced your conversion?</jr>

4. mime-type attribute of <bin-source>

How come this attribute is omitted from the <bin-source> element? Note that it is an attribute of <bin-target>.

<jr>We generally put attributes for <source> and <bin-source> in the parent, <trans-unit> and <bin-unit>, respectively. The 'mime-type' attribute of the target allows a different mime-type for the target in cases where it differs from that specified in the <bin-unit>. Otherwise, the mime-type of the target is unnecessary.</jr>

Cheers,
john

----------------------------------------------------------------
To subscribe or unsubscribe from this elist use the subscription manager: <http://lists.oasis-open.org/ob/adm.pl>
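On point 3 of the thread (CDATA versus Base64 for <internal-file>), the following sketch may help make both sides concrete. It checks whether a byte string could sit verbatim in a single CDATA section and shows the size cost of Base64. The payload and the function names are hypothetical, and the check assumes the enclosing document is UTF-8 encoded; it is an illustration, not a statement of what the spec should require.

```python
import base64


def is_xml_char(cp):
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def cdata_safe(data):
    """True if the bytes could be placed verbatim in one CDATA section.

    Assumes a UTF-8 encoded document. CDATA protects '<' and '&', but it
    cannot hold the terminator ']]>' or characters outside the XML 1.0
    Char range (for example NUL).
    """
    if b"]]>" in data:
        return False
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return all(is_xml_char(ord(ch)) for ch in text)


# Hypothetical binary payload: a NUL byte, reserved characters, the CDATA
# terminator, and a byte that is not valid UTF-8.
payload = b"\x00\x01<bin>&]]>\xff"

# Base64 output uses only A-Z, a-z, 0-9, '+', '/' and '=', all of which
# are ordinary XML character data.
encoded = base64.b64encode(payload).decode("ascii")
```

Text with only markup characters, such as b"plain <text> & more", passes cdata_safe, while arbitrary binary such as the payload above does not. Base64 always round-trips, at the price of four output characters for every three input bytes, which is the transfer cost raised earlier in the thread.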