xliff-comment message

Subject: Feedback from OpenTM2 XLIFF interoperability enhancements

From: Andrew Pimlott <andrew.pimlott@welocalize.com>
To: "xliff-comment@lists.oasis-open.org" <xliff-comment@lists.oasis-open.org>
Date: Tue, 10 May 2011 00:24:53 +0100

I have done some work to enhance the XLIFF support in OpenTM2. The use case we had in mind was a translator downloading a job as XLIFF from GlobalSight, translating the job in OpenTM2, then uploading it back to GlobalSight. (The work will be checked into the OpenTM2 Subversion repository shortly, and I can give anyone interested more details and possibly a demo.) Here is some feedback from this experience:

I ran into two big issues. One is the interpretation of inline markup; the other is the question of what changes I can and should make to the document when saving.

Regarding inline markup, most of the committee discussion seems to be about the representation of different types of markup. The problem I had is more fundamental. I was happy (for my "version 1") to simply display every bit of markup to the translator as a meaningless "black box". However, even that is problematic, because there is no way in XLIFF to correlate markup in the source with markup in the target. I found this issue in the OneContentModel/Requirements wiki page as 4.10, "Should be able to associate the same codes between the source and target segments" (http://wiki.oasis-open.org/xliff/OneContentModel/Requirements?action=recall&rev=38#Shouldbeabletoassociatethesamecodesbetweenthesourceandthetargetsegments). The correlation may not always exist, but if it does, it should be denoted explicitly, so you can say "this black box in the source goes with that black box in the target".
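
To make the missing piece concrete, here is a minimal Python sketch of the kind of check a tool could run if such a correlation were defined. It assumes, purely for illustration, a convention in which an inline code in the target corresponds to the source code with the same element name and the same "id" value; as noted above, XLIFF does not guarantee this. The names inline_codes, correlate and SAMPLE are hypothetical and are not taken from OpenTM2 or GlobalSight.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
# XLIFF 1.2 inline elements, each treated here as an opaque "black box".
INLINE_TAGS = {"ph", "bpt", "ept", "it", "g", "x", "bx", "ex"}


def inline_codes(segment):
    """Map (element name, id) -> element for each inline code in a segment."""
    codes = {}
    for elem in segment.iter():
        tag = elem.tag.split("}")[-1]  # drop the "{namespace}" prefix
        if tag in INLINE_TAGS and "id" in elem.attrib:
            codes[(tag, elem.get("id"))] = elem
    return codes


def correlate(trans_unit):
    """Pair source and target codes under the ASSUMED same-name, same-id rule."""
    ns = {"x": XLIFF_NS}
    source = trans_unit.find("x:source", ns)
    target = trans_unit.find("x:target", ns)
    src = inline_codes(source)
    tgt = inline_codes(target) if target is not None else {}
    paired = sorted(src.keys() & tgt.keys())
    source_only = sorted(src.keys() - tgt.keys())
    target_only = sorted(tgt.keys() - src.keys())
    return paired, source_only, target_only


# Hypothetical trans-unit: the target has dropped the third placeholder.
SAMPLE = """<trans-unit xmlns="urn:oasis:names:tc:xliff:document:1.2" id="1">
  <source>Click <ph id="1">&lt;b&gt;</ph>here<ph id="2">&lt;/b&gt;</ph> now<ph id="3">&lt;br/&gt;</ph></source>
  <target>Cliquez <ph id="1">&lt;b&gt;</ph>ici<ph id="2">&lt;/b&gt;</ph></target>
</trans-unit>"""

paired, source_only, target_only = correlate(ET.fromstring(SAMPLE))
print("paired:", paired)
print("only in source:", source_only)
print("only in target:", target_only)
```

Under that assumption a workbench could warn when a source code has no partner in the target. Without an explicit rule in the spec, though, a matching id may be a coincidence, and this is only one tool's guess.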

(As an aside, I think the model of treating markup as black boxes should get more attention, as it is good enough for many purposes and avoids a lot of complexity.)

Inline markup is also involved in the question of what changes I can make to the XLIFF document. The OneContentModel/Requirements brings up part of the issue as 3.1.4, "Content Manipulation" (http://wiki.oasis-open.org/xliff/OneContentModel/Requirements?action=recall&rev=38#ContentManipulation):

1. Should the specification define how the inline-content (and block level) model can be manipulated, including:
   1. indicate when a code can be deleted or not, can be cloned or not,
   2. indicate if a code can be moved out of sequence or not;

There is even more to it: (3) can I change the content or attributes of a markup tag? (4) can I copy markup tags straight from an alt-trans? And so on.

It's not just about markup. The XLIFF2.0/FeatureTracking wiki page brings it up in 2.4, "Permission Control and Validation", though I would cast it less in terms of permissions and more in terms of valid operations. After translation, am I allowed to set the state to anything besides "translated"? Is it valid to change the target of a trans-unit that was in state "final"? What about a trans-unit with translate="no"?
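
As an illustration of what "valid operations" means in practice, the following Python sketch shows the sort of policy a tool currently has to invent for itself. Every rule in it is an assumption made for this example, not something the XLIFF specification states: the function may_edit_target and the FROZEN_STATES set are hypothetical, and another implementor could reasonably choose differently.

```python
# ASSUMPTION: states that this sketch treats as closed to further edits.
# The spec defines these state values but does not say they freeze a target.
FROZEN_STATES = {"final", "signed-off"}


def may_edit_target(translate, state):
    """Return (allowed, reason) under this sketch's guessed policy.

    translate: value of the trans-unit's translate attribute, or None if absent
    state:     value of the target's state attribute, or None if absent
    """
    if translate == "no":
        return False, 'translate="no"'
    if state in FROZEN_STATES:
        return False, f'state "{state}" is treated as frozen'
    return True, "editable"


for translate, state in [("no", "new"), ("yes", "final"), (None, "needs-translation")]:
    print(translate, state, may_edit_target(translate, state))
```

The difficulty is visible in the first few lines: whether "final" really forbids editing is decided by the tool author, so two tools can disagree about the same document.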

Just as important as what I can add is what I can take away. What are my responsibilities as far as preserving data, including extensions? Do I always have to preserve element ordering (this can be a pain!)?

I would emphasize that answering these questions is not a feature; it is essential to interoperability with a good user experience. The basic question is, how do I know the changes I make will be accepted, and understood as expected, by the next tool? If I can't ensure this, users may get errors or unexpected behavior down the line. (Also, lacking guidance from the spec, each implementor is forced to consider every case and make a guess as to what will be most acceptable. This is a huge burden and causes implementors to give up on the spec or turn in shoddy implementations.)

There is one other trifling issue that causes great lossage in practice: The value for the "match-quality" alt-trans attribute is not specified. Defining it as a whole number in [0, 100] would be good enough for me.
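
Until the value is pinned down, a consuming tool has to guess at what it receives. Here is a minimal Python sketch of a tolerant reader that normalizes a match-quality string to a whole number in [0, 100]. The accepted spellings (a bare number, an optional "%" sign, optional decimals) are assumptions about what producers might emit, not something the spec defines, and parse_match_quality is a hypothetical helper name.

```python
import re

# ASSUMED spellings: "85", "85%", " 99.4 % ". Anything else is rejected.
_NUMBER = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*%?\s*$")


def parse_match_quality(raw):
    """Return an int in [0, 100], or None if the value cannot be interpreted."""
    if raw is None:
        return None
    m = _NUMBER.match(raw)
    if m is None:
        return None
    value = float(m.group(1))
    if not 0 <= value <= 100:
        return None
    return round(value)


for raw in ["85", "85%", " 99.4 % ", "fuzzy", "120", None]:
    print(repr(raw), "->", parse_match_quality(raw))
```

Even this leaves ambiguity: a producer writing "0.85" might mean 85 percent, but the sketch reads it as less than 1 percent. A defined value space would remove the guesswork.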

Finally, a more "philosophical" point. I think it would be really valuable if an XLIFF document could say what it is for. In my case, the document is for translation in a workbench. Based on different values of "for", the XLIFF spec could be more specific about how the document should be interpreted, and what transformations are allowed. For example, if the document is for translation, I probably shouldn't touch trans-units with state "needs-review-translation". I think this would help the spec become more precise, as well as allowing a tool to provide a more tailored UI (or pop up a message saying, "this tool doesn't support the operation expected by this XLIFF document").

Andrew
