Subject: RE: [xliff] Translating XLIFF 1.2

From: Yves Savourel <ysavourel@translate.com>
To: <xliff@lists.oasis-open.org>
Date: Tue, 23 Feb 2010 13:20:46 -0700

Hi Rodolfo,

OK, I've answered most of the comments below.
But this is getting old and life is too short to waste it on this type of discussion.

The 1.2 specification clearly intends to define one representation for segmentation, and that is <seg-source>. Even I can see that, and I'm way less smart than you are, Rodolfo.

Sure, the <seg-source> model is imperfect (as is your custom model), but it's what the consensus of the TC ended up being.
Sure, the text could be better, but it's clear enough for several tools to have implemented it and be interoperable.

For technical reasons, you chose to implement it another way. The result is a valid document, but also a document that is not really interoperable with XLIFF tools as far as manipulating the segments goes.

You can try to explain how one can segment without using <seg-source> by pointing at what you see as imprecisions in the text of the specification. But none of them changes the bottom line: while <seg-source> may not be perfect, it is the only way of representing segmentation that is described in XLIFF 1.2.
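
For readers who have not looked at that model, here is a minimal sketch of how a tool might read it. The element and attribute names (<seg-source>, <mrk mtype="seg">, mid) are those of XLIFF 1.2. The sample text and the helper name segments_of are made up for illustration and do not come from the specification.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Illustrative document: one paragraph, segmented with <seg-source>.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>First sentence. Second sentence.</source>
        <seg-source><mrk mtype="seg" mid="1">First sentence.</mrk> <mrk mtype="seg" mid="2">Second sentence.</mrk></seg-source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def segments_of(trans_unit):
    """Return (mid, text) pairs for each <mrk mtype="seg"> in <seg-source>."""
    seg_source = trans_unit.find("x:seg-source", NS)
    if seg_source is None:
        return []  # no segmentation is represented for this unit
    return [
        (mrk.get("mid"), "".join(mrk.itertext()))
        for mrk in seg_source.iterfind("x:mrk[@mtype='seg']", NS)
    ]


root = ET.fromstring(SAMPLE)
unit = root.find(".//x:trans-unit", NS)
print(segments_of(unit))
```

The paragraph stays whole in <source>, and the segment boundaries live next to it, which is what lets a tool join or resplit them later.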

What saddens me is not that you don't agree with <seg-source> and went another way. It is that you are trying to prove that an XLIFF 1.2 document can be seen as segmented without using <seg-source>. It's just not true.

I would have no problem if you were saying: "I don't like <seg-source> because there are things important for my tool that I can't do with that model. So I choose to use <group>/<trans-unit> instead. And yes, I suppose, based on the 1.2 specification, if I wanted my documents to show other tools that they are segmented I should use <seg-source>, but that means having one segment per <seg-source> and that looks silly to me, so I don't use it, and that means my document is not really interoperable for segmentation, and so be it."

Then we could move on to a new discussion on getting a better segmentation representation for 2.0 :)



Now the comments (probably useless, but since I wrote them...):

>> The problem is that because such file is not following
>> the way XLIFF represents segmentation, the tool won't be 
>> able to see the different 'segments' as part of the same 
>> paragraph, and therefore won't be able to manipulate them
>> as needed (e.g. let the translators join/resplit)
>
> That's wrong. I've written tools that are able to merge/split
> segments without requiring <seg-source> or custom namespaces.

Because your tool is using its own segmentation model.

I'm sure anyone could do the same, but the problem is that a tool following the XLIFF segmentation representation cannot *know* it should see some <group> as being a paragraph in your document.
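
A rough sketch of what such a tool would see, assuming a hypothetical layout where one paragraph is stored as a <group> of one-sentence <trans-unit> elements. The sample document and the helper name joinable_segments are illustrative assumptions, not anything taken from the specification or from an actual tool.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical layout: one paragraph stored as a <group> of one-sentence units.
GROUPED = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" datatype="plaintext">
    <body>
      <group id="p1">
        <trans-unit id="1"><source>First sentence.</source></trans-unit>
        <trans-unit id="2"><source>Second sentence.</source></trans-unit>
      </group>
    </body>
  </file>
</xliff>"""


def joinable_segments(root):
    """Map each trans-unit id to the number of <mrk mtype="seg"> segments
    that a <seg-source>-based tool could offer to join or resplit."""
    result = {}
    for unit in root.iterfind(".//x:trans-unit", NS):
        seg_source = unit.find("x:seg-source", NS)
        marks = [] if seg_source is None else seg_source.findall("x:mrk[@mtype='seg']", NS)
        result[unit.get("id")] = len(marks)
    return result


counts = joinable_segments(ET.fromstring(GROUPED))
print(counts)  # {'1': 0, '2': 0}
```

Each unit reports zero segments, and nothing tells the tool that units 1 and 2 belong to the same paragraph: the <group> carries no such meaning for it.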


>> Segmentation is represented by <seg-source> (there is no 
>> fudging about that: I'm just reading the specification), 
>> therefore if it's not in the file, it means the file is not 
>> segmented.
>
> That's again wrong. Nothing in the specs says that if a file does 
> not use <seg-source> it is not segmented.

If interoperability was based on what is *not* described in the specification, there would be no standards :)

You cannot tell an XLIFF processor what is a segment if there is no <seg-source> because that is (for the Nth time) the only described way to represent segmentation in XLIFF 1.2.

Regardless of what your <source> intends to represent, if it does not have <seg-source>, it cannot be viewed as segmented by XLIFF tools.


> Segmentation can be applied to the source text and 
> the resulting segments stored in <source> elements. 

Please, show us the text of the specification that says it is the way XLIFF represents segmentation.

 
> Where is it written in the specs that segmentation must be 
> done using <seg-source>? 
> The specification uses vague terms like "may be", 
> "can", "typically" and "generally". It does not define 
> what a segmented or unsegmented file is.

"...it may be important for the user agent to break down the content of the <source> into smaller runs of text":
That "may be" refers to applying segmentation (it's optional), not how the segmentation is represented.

"...content of the <seg-source> is generally the translatable text, typically divided into segments through the use of <mrk mtype="seg"> elements..."
Those "generally" and "typically" refer to the content of <seg-source>. They don't say that <seg-source> is generally used, or typically used, and that you can use something other than <seg-source>. They refer to its content.
And, yes: it wouldn't hurt to remove those two terms.
But, no: they don't open the door for not using <seg-source>.


> You indirectly admitted, the use of <seg-source> 
> is optional. So, why do you say that applications 
> that don't use <seg-source> don't follow the 
> specification?

Not indirectly (you make me look shifty :) I said so very directly: <seg-source> is an optional element in XLIFF.
The reason it is optional is that *representing segmentation* is optional.
Not that, if you do represent segmentation, there is another official option than <seg-source>.


>> Using that method is just like using a custom 
>> namespace with an non-XLIFF element that stored 
>> segment boundaries: It gives us a valid file,
>> but uses a proprietary way of represent segmentation.
>
> This statement is incorrect. The use of <source>/<target> 
> is not a proprietary way of doing things.

The statement is correct: Representing segmentation with <group>/<trans-unit> is not the way described by the XLIFF specification, therefore it's a custom way, and it is as non-interoperable as using a custom namespace: other tools have to assume the text is not segmented.
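
The comparison can be sketched as follows. The my:boundaries element, its offsets attribute and its namespace URI are invented here purely for illustration, as is the helper name has_standard_segmentation. Only <seg-source> and <mrk mtype="seg"> come from XLIFF 1.2.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}
XLIFF_NS = NS["x"]

# Segmentation expressed the way XLIFF 1.2 describes it.
STANDARD = (
    f'<trans-unit xmlns="{XLIFF_NS}" id="1">'
    '<source>One. Two.</source>'
    '<seg-source><mrk mtype="seg" mid="1">One.</mrk> '
    '<mrk mtype="seg" mid="2">Two.</mrk></seg-source>'
    '</trans-unit>'
)

# Hypothetical custom extension storing a boundary offset (invented element).
CUSTOM = (
    f'<trans-unit xmlns="{XLIFF_NS}" xmlns:my="http://example.com/my-seg" id="1">'
    '<source>One. Two.</source>'
    '<my:boundaries offsets="4"/>'
    '</trans-unit>'
)


def has_standard_segmentation(unit_xml):
    """True only if the unit has <seg-source> with at least one <mrk mtype="seg">."""
    unit = ET.fromstring(unit_xml)
    seg_source = unit.find("x:seg-source", NS)
    return seg_source is not None and seg_source.find("x:mrk[@mtype='seg']", NS) is not None


print(has_standard_segmentation(STANDARD), has_standard_segmentation(CUSTOM))  # True False
```

A generic tool has no reason to understand the custom element, just as it has no reason to read a <group> of units as one segmented paragraph.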


> It's a way allowed in the specification. I strongly sugest
> you to read the definitions of <source> and <target>.
> Neither of them mention the word "segmentation" or 
> <seg-source>. Those elements are described as the container 
> for translatable text and translation.

Indeed: They don't talk about segmentation at all, so why are you inferring that <source> can be used to represent the result of segmentation, when there is a section dedicated to that topic?


>> The bottom line is that there is only one way 
>> to represent segmentation in XLIFF, it's the <seg-source> model.
>
> There isn't only one way. There are many. 
> I already gave you an example.

An example is not a description coming from the specification.
As far as I can tell there is only one model described: <seg-source>.


> For start, there is no real need to "represent" segmentation 
> in an XLIFF file. XLIFF is a container for translatable material.
> The required basics are holders for text to be translated and 
> holders for the corresponding translation. The standard provides 
> that in <source> and <target>.
> If there is a need to represent segmentation, 
> it can be done using <group> as I explained before.

Many models *could* be used.
But the 1.2 specification chose only one: <seg-source>. So to be interoperable a tool should use that one.


> I see here a clear case of poor writing that reminds 
> me the problem with SRX that required releasing a new 
> version of the standard. The original authors failed 
> to express their ideas in writing.

Don't get me started back on SRX :) You still have not explained why the 1.0 example has duplicated rules if cascading is implicit. But it's a dead topic I don't have time to go back to.


> The intention of the people that added segmentation 
> section was clear only for the authors. It was not 
> properly described in the specification document.
> The use of <seg-source> was left as optional when 
> it should have been described as a required step 
> in the translation process. 

What is optional is whether to represent segmentation at all, not whether to use <seg-source> when you do have to represent it.

As for making it mandatory, I strongly disagree: XLIFF should allow un-segmented entries.
Segmentation is a complex operation, and requiring translation tools to do it would be just wrong.


> All the justifications and explanations that you 
> added in this mailing list are missing in the 
> specification document.

Actually I didn't try to justify anything.
I merely tried to explain the error of your ways :)


> The authors of the segmentation section did not 
> consider that if something is optional, those that 
> don't need it might not implement it.

I don't know who the author(s) are, but I'm sure s/he/they meant exactly that: <seg-source> is optional, so if segmentation is not needed the element need not be implemented.


> If representing segmentation using <seg-source> was 
> considered essential for interoperability, it should 
> have been written.
> The specification lacks a clear explanation of the 
> official workflow expected to be present in XLIFF 
> based tools.

It seems having a whole section describing it makes it important enough.

But I agree also: there are many things that could use better wording, more implementation notes in XLIFF. Hopefully we can make 2.0 much better in that respect.


> If the intention is to require the use 
> of <seg-source>, then "it may be important" is a 
> poor choice of words for the introduction of 
> Segmentation section.

You are citing out of context, making it look like it applies to <seg-source> :)

"...it may be important for the user agent to break down the content of the <source> into smaller runs of text (for example, sentences)."

This refers to applying segmentation (the fact that it may be important for some tools), not "it may be important to use <seg-source> if you are representing segmentation".

Cheers,
-ys
