Subject: Teleconference - Sep-14-2010 - 13:30 UTC - Summary
From: Yves Savourel <ysavourel@translate.com>
To: <xliff-inline@lists.oasis-open.org>
Date: Tue, 14 Sep 2010 08:59:15 -0600
XLIFF Inline Markup Subcommittee Teleconference Summary

=== 1) Administration

Attending: Andrew, Milan, Arle, Yves, Lucia, Dimitra
Regrets: Asgeir



=== 2) Discussion: Requirements

Requirement working page:
http://wiki.oasis-open.org/xliff/OneContentModel/Requirements

We had several action items:

-- ACTION ITEM: Yves to find a real example for requirement #4
--> Sorry, still not done at this time (forgot).
Andrew: The bookmark element in ODF is a good illustration.
Yves: thanks, will use that then.


-- ACTION ITEM: Arle to bring any additional requirements from OSCAR.
Arle: Nothing at the moment.

Arle: Have to step out for a moment.


-- Discussion on requirement #13:
http://lists.oasis-open.org/archives/xliff-inline/201008/msg00016.html

Andrew: Should it be just text? Include codes as well.

Consensus on:

Must be able to represent separately different flows of text and codes when, in the original format, they are mixed together.
Example 1: In DITA a footnote is stored at the location where it is referred to:
<p>Palouse horses<fn>A Palouse horse is the same as an Appaloosa.</fn> have spotted coats.</p>
This p element contains two separate flows: "Palouse horses have spotted coats" and "A Palouse horse is the same as an Appaloosa."
Example 2: The value of the HTML ALT attribute is stored in the IMG tag and can be within a paragraph:
<p>Click here: <img alt='OK' src='ok.png'/>.</p>
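
A minimal Python sketch of what separating the flows in Example 1 could look like. It is only an illustration, not a proposed representation: the SUBFLOW_ELEMENTS set and the split_flows function are hypothetical names, and the sketch assumes the extraction tool already knows which elements start a separate flow.

```python
import xml.etree.ElementTree as ET

# Hypothetical: element names whose content forms its own flow.
SUBFLOW_ELEMENTS = {"fn"}

def split_flows(xml_fragment):
    """Return (main_flow, sub_flows) for one paragraph-like element."""
    root = ET.fromstring(xml_fragment)
    main_parts = []
    sub_flows = []

    def walk(elem, parts):
        if elem.text:
            parts.append(elem.text)
        for child in elem:
            if child.tag in SUBFLOW_ELEMENTS:
                # Collect the nested flow on its own, outside the parent.
                sub_parts = []
                walk(child, sub_parts)
                sub_flows.append("".join(sub_parts))
            else:
                walk(child, parts)
            # Text following the child always belongs to the enclosing flow.
            if child.tail:
                parts.append(child.tail)

    walk(root, main_parts)
    return "".join(main_parts), sub_flows

main, subs = split_flows(
    "<p>Palouse horses<fn>A Palouse horse is the same as an Appaloosa.</fn>"
    " have spotted coats.</p>"
)
print(main)
print(subs)
```

The sketch drops the position of the footnote in the main flow; a real representation would presumably keep a marker there.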


-- Discussion on requirement #14:

Milan: should be optional.
Andrew: is the direction important? 
Milan: could be with different context, so specifying exact relationships may be difficult.

Consensus on:

Should be able to represent the mutual relationships between a nested flow of text and its parent
The format should be able to represent both flows and have some information about their relationships, so the two texts can be put in context when needed.
For example, the relation between the value of an HTML ALT attribute and the paragraph element where it appears should be somehow preserved:
<p>Click here: <img alt='OK' src='ok.png'/>.</p>
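
One possible way to picture such a relationship, as a Python sketch only. The Flow class, the flow_id/parent_id fields and the {f2} placeholder syntax are all assumptions made for the illustration, and the function only looks at direct children of the paragraph.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

@dataclass
class Flow:
    flow_id: str
    text: str
    parent_id: Optional[str] = None  # hypothetical link back to the enclosing flow

def extract_with_relations(xml_fragment):
    """Extract the paragraph flow plus one nested flow per IMG ALT value."""
    root = ET.fromstring(xml_fragment)
    parent = Flow(flow_id="f1", text="")
    flows = [parent]
    parts = [root.text or ""]
    for child in root:
        alt = child.get("alt")
        if child.tag == "img" and alt is not None:
            sub = Flow(flow_id=f"f{len(flows) + 1}", text=alt,
                       parent_id=parent.flow_id)
            flows.append(sub)
            # Leave a marker in the parent so the nested flow can be located.
            parts.append("{" + sub.flow_id + "}")
        parts.append(child.tail or "")
    parent.text = "".join(parts)
    return flows

flows = extract_with_relations("<p>Click here: <img alt='OK' src='ok.png'/>.</p>")
for flow in flows:
    print(flow)
```

Here the relationship is recorded in both directions: the nested flow names its parent, and the parent text carries a marker for the nested flow. Whether both are needed is the question Andrew raised about direction.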


-- Discussion on requirement #15:

Milan: For a single character or a string of invalid characters?
Andrew: My experience is that they occur one by one.

ACTION ITEM: Yves to check which term the XML specification uses, "illegal" or "invalid", and use it everywhere.

Consensus on:

Should be able to represent illegal XML characters in the content
Some characters are illegal in XML, but they may appear in extracted text and we should have a common way to represent them so they can be preserved and merged back if necessary, without causing the XML tools to fail. 
For example in the following Java property string "Text with \u001a" the character U+001A is illegal in XML but needs to have a representation in XLIFF.
Note: An example of how some XML formats handle this case is the TS format from Qt-Linguist, which uses a <byte> element to represent such characters.
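
A rough Python sketch of the idea, for illustration only. The character class follows the Char production of XML 1.0; the x-char element and its hex attribute are hypothetical names chosen for this example and are not something the subcommittee has agreed on.

```python
import re
from xml.sax.saxutils import escape

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def encode_for_xml(text):
    """Escape markup characters and replace each illegal character
    with a placeholder element (hypothetical <x-char hex="..."/>)."""
    out = []
    pos = 0
    for match in ILLEGAL_XML_CHARS.finditer(text):
        out.append(escape(text[pos:match.start()]))
        out.append('<x-char hex="%04X"/>' % ord(match.group()))
        pos = match.end()
    out.append(escape(text[pos:]))
    return "".join(out)

encoded = encode_for_xml("Text with \u001a")
print(encoded)  # Text with <x-char hex="001A"/>
```

Each illegal character gets its own placeholder, which matches Andrew's remark that such characters tend to show up one at a time.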


-- Discussion on requirement #16:

Milan: Looks like a warning for the translator, no?
Yves: For both tool and possibly translator I think.

Consensus on:

Inline codes should have a way to store information about the effect on the segmentation
As some inline codes may have an effect on the segmentation of a given content, it is useful if segmentation-specific hints could be stored along with an inline code. 
For example: In HTML a <BR> element indicates a forced line break, while a <B>...</B> element should not affect the segmentation.
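
A small Python sketch of how a tool might use such a hint. The InlineCode class and its forces_break flag are hypothetical, and keeping the breaking code at the end of the preceding segment is just one possible choice.

```python
from dataclasses import dataclass

@dataclass
class InlineCode:
    native: str                 # original markup, e.g. "<BR>"
    forces_break: bool = False  # hypothetical segmentation hint

def split_on_hints(content):
    """Split a list of str and InlineCode items into segments,
    ending a segment after every code that forces a break."""
    segments = [[]]
    for item in content:
        segments[-1].append(item)
        if isinstance(item, InlineCode) and item.forces_break:
            segments.append([])
    return [seg for seg in segments if seg]

content = [
    "First line",
    InlineCode("<BR>", forces_break=True),
    "Second line with ",
    InlineCode("<B>"),
    "bold",
    InlineCode("</B>"),
    " text",
]
segments = split_on_hints(content)
for seg in segments:
    print(seg)
```

The <BR> code splits the content into two segments, while the <B> and </B> codes stay inside the second one.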


-- Discussion on requirement #17:

Andrew: Not a requirement, more like a guiding principle.
Yves: agree, it may not be possible to implement.
We should group "guiding principles" in a separate list.

Consensus on:

[guiding principle] If possible, all text nodes of the content should be real text, not codes
When processing the content with XML parsers, all the nodes of type TEXT should contain real text.
This allows the separation between textual content and codes to be physical even in XML tree representation, rather than requiring interpretation of the markup. 
For example, the imaginary representation below stores the native codes [startBold] and [endBold] as part of the content. This is what we want to try to avoid. 
This text is in <code>[startBold]</code>bold<code>[endBold]</code>.
In contrast, the imaginary representation below stores the native codes [startBold] and [endBold] outside the content. Therefore the sum of all TEXT nodes represents only true text. This is what we want to try to achieve. 
This text is in <code native="[startBold]"/>bold<code native="[endBold]"/>.
Note that this requirement may or may not be possible to achieve, depending on various factors.
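
The difference can be checked with any XML parser. Below is a short Python sketch, assuming the two imaginary representations are wrapped in a hypothetical <s> element and that, in the second one, the codes are empty elements carrying the native data in an attribute.

```python
import xml.etree.ElementTree as ET

# Native codes stored as element content (what we want to avoid).
codes_in_text = ET.fromstring(
    "<s>This text is in <code>[startBold]</code>bold<code>[endBold]</code>.</s>"
)
# Native codes stored in an attribute of an empty element.
codes_in_attrs = ET.fromstring(
    '<s>This text is in <code native="[startBold]"/>bold'
    '<code native="[endBold]"/>.</s>'
)

# Concatenate every text node of each tree, in document order.
text_a = "".join(codes_in_text.itertext())
text_b = "".join(codes_in_attrs.itertext())
print(text_a)  # This text is in [startBold]bold[endBold].
print(text_b)  # This text is in bold.
```

In the first case the tool must interpret the markup to filter out the codes; in the second the text nodes alone already give the translatable text.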


Yves: Only one item left!
Let's work on it by email, and at the next meeting we may be able to start discussing the implementation.


=== 4) Other Business

None.

-meeting adjourned