RE: [xliff] Suggested additional changes for XLIFF 1.2

Hi Bryan,

Thank you for your explanations. Please find my replies to your comments on item 4 below:

Best regards,

Magnus

From: bryan.s.schnabel@exgate.tek.com [mailto:bryan.s.schnabel@exgate.tek.com]
Sent: Friday, September 23, 2005 12:02 PM
To: Magnus Martikainen; xliff@lists.oasis-open.org; tony.jewtushenko@productinnovator.com
Subject: RE: [xliff] Suggested additional changes for XLIFF 1.2

Sure Magnus,

I'm happy to provide some reasons. I'll try to keep this note "light" and not too loaded up with code examples (no promises, though).

As for item 4, I think the whole idea of dealing with translatable attributes is a tricky thing. As an XML purist who grew up as an SGML purist, my instinct is to declare that translatable strings should be designed to be element content, not attribute content. But in the real world, we cannot always do that (html's <img alt="some other caption", for example, is a good use of an attribute that efficiently contains a translatable string).

There are other examples of strings in attributes that are very poorly designed, but exist in the real world, like this:

<Ad>

<announce>The cost for the

<item label="lift ticket" season="spring" /> is

</announce>

</Ad>

In the technical publications world we see more badly designed XML than one would expect. The <sub element is very nice in this kind of a case because it lets us mark up the XLIFF in such a way that the content can be (kind of) normalized so that all of the pertinent text is in the same <trans-unit. It is therefore not too difficult to map this efficiently to enable a nice roundtrip via XSLT. I would also argue that since it's in one <trans-unit, it is easier on the translator because it is a bit more cohesive than the approach I would envision if I had to use the xid method. Making the translator jump from one <trans-unit to another, keeping the hierarchy and flow straight, I think, is less friendly.
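
As a rough sketch of that idea (the ids and the way the native markup is escaped inside <ph> are assumptions for illustration, not the output of a real filter), the text and both attribute values from the fragment above could sit in one <trans-unit like this:

```xml
<trans-unit id="1">
  <!-- the native <item .../> tag is carried as escaped code inside <ph>;
       each translatable attribute value is wrapped in <sub> -->
  <source>The cost for the <ph id="1">&lt;item label="<sub>lift ticket</sub>" season="<sub>spring</sub>" /&gt;</ph> is</source>
</trans-unit>
```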

But to be fair, I have not actually coded up the xid approach, so maybe I'm not seeing a better xid way. I'm open to consider it if a way exists.

[Magnus] This is an interesting example. From a localization perspective extracting the translatable attribute content into separate standalone units would probably make it easier to work with, and definitely would improve recycling, as the individual components could be re-used independently.

“The cost for the <ph/> is <ph/>.” (attributes and element content omitted for clarity) looks easy to translate and is likely a good re-use candidate. And proper use of context for the standalone attribute values should give the translator enough information.

On the other hand, translating “The cost for the <ph>lift ticketspring</ph> is <ph>dollarCanadian175</ph>.” (attributes and element content omitted for clarity) actually looks a lot more complex, at least to me.

In the end the “translatability” is to a large extent a matter of taste here, and it is probably possible to find specific examples that “prove” that either representation is the better one. The re-usability of existing translations, however, is definitely greater when using xid.

Regarding the ability to do round trips using XSLT transformation, I would guess (though I have not tried) that you could also achieve this if translatable attribute content is stored in a different <trans-unit>, as the xid can be used to build XPath expressions to locate the content. But of course it may be a bit more complex.
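
A minimal sketch of the xid alternative, under stated assumptions: the unit ids are made up, only the label attribute is linked (season would need its own unit and its own link), and the context values are purely illustrative:

```xml
<trans-unit id="1">
  <!-- the attribute value is left empty here; xid points at the unit holding it -->
  <source>The cost for the <ph id="1" xid="1-label">&lt;item label="" season="spring" /&gt;</ph> is</source>
</trans-unit>
<trans-unit id="1-label">
  <source>lift ticket</source>
  <!-- context tells the translator where the standalone string comes from -->
  <context-group name="origin" purpose="information">
    <context context-type="element">item</context>
    <context context-type="x-attribute">label</context>
  </context-group>
</trans-unit>
```

A round-trip stylesheet could then, in principle, look up the unit whose id matches the xid value, for example with an XPath expression along the lines of //trans-unit[@id='1-label']/target (namespaces ignored for brevity).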

As for item 5, my reasons are nearly the reverse of the reasons for item 4. The identification of a resource's type or context, or name seems to me to be metadata, and not data. I am most comfortable dealing with that kind of information as an attribute value. I agree with you that the restype values in the current specification are highly file format specific. That's kind of bugged me for a while, but I haven't raised it as an issue. I kind of think that it should be all-or-nothing. And since we can never know all the resource types for all the given resources in the world, my preference would be to just make the attribute type "string", rather than an enumerated list. But for me, for now, the "x-" extension works out just fine.

So much for my "light" explanations. Many of your bullets under items 4 and 5 make sense to me. Even though I'm mostly against these changes, I think you make good points, and I'd be happy to participate in a larger discussion.

Thanks for bringing these thoughts to light,

Bryan

-----Original Message-----
From: Magnus Martikainen [mailto:Magnus@trados.com]
Sent: Friday, September 23, 2005 9:58 AM
To: Schnabel, Bryan S; xliff@lists.oasis-open.org; tony.jewtushenko@productinnovator.com
Subject: RE: [xliff] Suggested additional changes for XLIFF 1.2

Hi Bryan,

Thank you for your input. I am definitely interested in hearing your opinions. Would you mind providing the reasons why you disagree with 4 and 5?

Thanks,

Magnus

From: bryan.s.schnabel@exgate.tek.com [mailto:bryan.s.schnabel@exgate.tek.com]
Sent: Friday, September 23, 2005 9:50 AM
To: Magnus Martikainen; xliff@lists.oasis-open.org; tony.jewtushenko@productinnovator.com
Subject: RE: [xliff] Suggested additional changes for XLIFF 1.2

Hi Magnus,

These are all thought provoking.

I agree with 1, 2, and 3.

I disagree with 4.

And I strongly disagree with 5.

Perhaps these points are worthy of some debate. I have my opinions, but I'm willing to hear more.

Thanks,

Bryan

-----Original Message-----
From: Magnus Martikainen [mailto:Magnus@trados.com]
Sent: Tuesday, September 20, 2005 10:51 AM
To: xliff@lists.oasis-open.org; Tony Jewtushenko
Subject: [xliff] Suggested additional changes for XLIFF 1.2

Hi Tony etc.,

Following up on the conference call today, here are my suggestions for additional changes to the XLIFF 1.2 specification:

1) The following attributes seem to be used only for file format specific purposes. I propose deprecating them, as we should rather use the extension mechanism for such things:

extradata

menu

menu-option

menu-name

coord

css-style

style

exstyle

extype

In addition to this I have the following proposals:

2) The name attribute for <context-group> is currently marked as REQUIRED. (The specification indicates that this attribute should only be used for processing instructions.) I propose to relax this rule and allow the name attribute for a <context-group> to be optional.

3) I propose to add an extension point at the <xliff> level. This would be useful for providing information that is common to all the files in the XLIFF document. Some examples of what this could be used for:

Job / project information

Contact details

Tool settings

Copyright information

General instructions

Embedded style guides

Analysis reports

Pricing information
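
To make the proposal concrete, a document using such an extension point might look roughly like the sketch below. The job namespace, its element names, and the position of the extension content before the first <file> are all hypothetical and only meant to illustrate the idea:

```xml
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2"
       xmlns:job="urn:example:job-info">
  <!-- hypothetical extension content shared by every <file> in the document -->
  <job:info>
    <job:contact>translations@example.com</job:contact>
    <job:instructions>Keep product names in English.</job:instructions>
  </job:info>
  <file original="ad.xml" source-language="en" datatype="xml">
    <body>
      <trans-unit id="1">
        <source>lift ticket</source>
      </trans-unit>
    </body>
  </file>
</xliff>
```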

4) I propose to deprecate the <sub> element in favour of using the xid attribute and putting the embedded translatable content inside a separate <trans-unit>. Reasons:

The two mechanisms are used for the same purpose.

The xid mechanism is superior to the <sub> mechanism.

The <sub> element content disrupts the text flow. It requires tricky special processing by XLIFF compliant tools, and can also be distracting during translation. Removing it will make it easier to create good XLIFF compliant tools.

The content of the <sub> is a separate unit for translation, and as such is much better represented in its own <trans-unit>.

More context information can be provided for a separate <trans-unit>. E.g. length restrictions for the embedded content can be expressed.

Keeping the embedded translatable content in a separate <trans-unit> will yield better recycling. Both the surrounding text and the embedded content can be treated as separate units. One can be re-used if the other changes, and if the same embedded content appears in different text it can be re-used independently.

5) I propose to deprecate the restype attribute in favour of using the <context> element to provide context information. Reasons:

The two mechanisms serve a similar purpose and can be used for the same thing. Providing two ways to do the same thing makes interoperability more difficult.

The <context> mechanism is more powerful than the restype attribute.

The restype values in the current specification are highly file format specific.

If multiple contexts apply it is not clear if one can specify more than one value for a restype attribute.
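
The difference can be sketched as follows. The restype value and the context values are made-up examples using the "x-" prefix for user-defined types, not values taken from the specification:

```xml
<!-- restype: a single attribute value describes the resource -->
<trans-unit id="1" restype="x-attribute">
  <source>lift ticket</source>
</trans-unit>

<!-- context: several pieces of context can be listed side by side -->
<trans-unit id="2">
  <source>lift ticket</source>
  <context-group name="origin" purpose="information">
    <context context-type="element">item</context>
    <context context-type="x-attribute">label</context>
  </context-group>
</trans-unit>
```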

Best regards,

Magnus Martikainen
