OASIS Mailing List Archives

xliff message


Subject: XLIFF 2.0: Improving whitespace handling

Hi all,

In XLIFF 1.x whitespace is handled via the standard xml:space attribute, and 
this attribute can be applied to <file>, <group>, <trans-unit> and 
<alt-trans> elements. These elements are all structural containers in that 
they do not allow mixed-content children, only child elements.

file element:
<xsd:attribute ref="xml:space" use="optional"/>

group element:
<xsd:attribute default="default" ref="xml:space" use="optional"/>

trans-unit element:
<xsd:attribute default="default" ref="xml:space" use="optional"/>

alt-trans element:
<xsd:attribute default="default" ref="xml:space" use="optional"/>

In addition to these, any element that supports attribute extensions (namely 
<alt-trans>, <bin-source>, <bin-target>, <bin-unit>, <bpt>, <bx/>, <ept>, 
<ex/>, <file>, <g>, <group>, <it>, <mrk>, <ph>, <seg-source>, <source>, 
<target>, <tool>, <trans-unit>, <x/>, and <xliff>) can also use the 
'xml:space' attribute.

Note that a schema-aware XML parser adds any missing default-valued 
attributes when parsing a document. This makes e.g. the xml:space attribute 
on the <file> element somewhat redundant, since its value is overridden on 
every <group>, <trans-unit> and <alt-trans> element. 
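
To illustrate the override, here is a minimal Python sketch. It does not run a 
real schema validator: apply_schema_defaults() is a hypothetical helper that 
only simulates what a schema-aware parser would do with the default="default" 
declarations quoted above, and the sample document omits the XLIFF namespace 
and most required attributes for brevity.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the xml:space attribute under the predefined XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
# Elements whose XLIFF 1.x declaration (quoted above) has default="default".
DEFAULTED = {"group", "trans-unit", "alt-trans"}


def apply_schema_defaults(root):
    """Simulate a schema-aware parser: add missing default-valued attributes."""
    for elem in root.iter():
        if elem.tag in DEFAULTED and XML_SPACE not in elem.attrib:
            elem.set(XML_SPACE, "default")


def effective_space(root, target):
    """Return the xml:space value in scope for target (nearest ancestor-or-self)."""
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        if XML_SPACE in node.attrib:
            return node.attrib[XML_SPACE]
        node = parents.get(node)
    return "default"  # nothing declared anywhere up the tree


doc = """<xliff><file xml:space="preserve"><body>
<trans-unit id="1"><source>hello  world</source></trans-unit>
</body></file></xliff>"""

root = ET.fromstring(doc)
source = root.find(".//source")

before = effective_space(root, source)  # inherited from <file>
apply_schema_defaults(root)
after = effective_space(root, source)   # now decided by <trans-unit>
print(before, after)                    # prints: preserve default
```

So a tool that sets 'preserve' on <file> only gets that value down to <source> 
as long as nobody fills in the schema defaults along the way.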

Another example is handling, say, the <internal-file> element. Here it is 
sometimes very important that whitespace is preserved. However, to accomplish 
this, tools need to set the attribute on the enclosing <file> element, since 
the xml:space attribute is not allowed on the <internal-file> element itself. 

The elements where whitespace handling might be important are the 
content containers, where the child content is mixed content or text. These 
include <seg-source>, <target>, <internal-file>, inline elements and 
<context>, to name a few. 

Let me give an example of another issue with the current whitespace handling:

<trans-unit id='1' xml:space='preserve'>
  <source>hello world!!</source>

Above, I have set the xml:space attribute to 'preserve' to ensure that the 
whitespace in my source text is kept exactly as written. However, there is a 
problem with this: I do not care about the whitespace elsewhere, e.g. between 
the <trans-unit> opening tag and the <source> opening tag. The above fragment 
could just as well have been written as follows:

<trans-unit id='1' xml:space='preserve'><source>hello 

However, according to the XML specification, the two fragments above are not 
equal, since the xml:space attribute affects all child elements (unless 
overridden in a child).
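
The difference can be made visible with a short Python sketch. The two 
fragments below are my own completed versions of the examples above (closing 
tags added so that they parse), so treat them as an assumption rather than as 
the exact originals.

```python
import xml.etree.ElementTree as ET

# Indented variant: whitespace sits between <trans-unit> and <source>.
pretty = """<trans-unit id='1' xml:space='preserve'>
  <source>hello world!!</source>
</trans-unit>"""

# Compact variant: no whitespace outside <source>.
compact = (
    "<trans-unit id='1' xml:space='preserve'>"
    "<source>hello world!!</source>"
    "</trans-unit>"
)


def text_nodes(elem):
    """Collect every character-data chunk in document order, whitespace included."""
    return list(elem.itertext())


pretty_nodes = text_nodes(ET.fromstring(pretty))
compact_nodes = text_nodes(ET.fromstring(compact))

print(pretty_nodes)   # ['\n  ', 'hello world!!', '\n']
print(compact_nodes)  # ['hello world!!']

# The translatable text is identical; only the structural whitespace differs.
same_source = (
    ET.fromstring(pretty).find("source").text
    == ET.fromstring(compact).find("source").text
)
print(same_source)    # True
```

With 'preserve' in scope on <trans-unit>, the two extra whitespace-only chunks 
in the first variant are formally significant, even though no tool is likely 
to care about them.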

I don't have an immediate solution for how we solve these issues for XLIFF 
2.0; however, some initial ideas are:
1) Allow attribute extensions for all mixed-content elements in the 
specification, or at least for all where xml:space would make a difference.
2) Create a section about whitespace handling in the specification, in 
particular adopting a convention that any <target> element should use the 
same whitespace handling as its sibling <source> element. 
3) Let XML processors do what they do best: they will honour the xml:space 
attribute :)
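
As a rough illustration of idea 2, a tool could resolve the whitespace mode 
from <source> and apply it to the sibling <target>. This is only a sketch under 
my own assumptions: space_for_unit() and normalize() are hypothetical helpers, 
and treating 'default' as "collapse runs of whitespace" is one possible 
application default, not something the XML specification mandates.

```python
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def space_for_unit(trans_unit, inherited="default"):
    """Hypothetical convention: <source> decides, <target> follows its sibling."""
    unit_mode = trans_unit.get(XML_SPACE, inherited)
    source = trans_unit.find("source")
    if source is None:
        return unit_mode
    return source.get(XML_SPACE, unit_mode)


def normalize(text, mode):
    """Assumed application default: collapse whitespace unless mode is 'preserve'."""
    if mode == "preserve":
        return text
    return " ".join(text.split())


unit = ET.fromstring(
    "<trans-unit id='1'>"
    "<source xml:space='preserve'>hello  world!!</source>"
    "<target>hallo  Welt!!</target>"
    "</trans-unit>"
)

mode = space_for_unit(unit)
target_text = normalize(unit.find("target").text, mode)
print(mode)               # preserve
print(repr(target_text))  # 'hallo  Welt!!' (double space kept)
```

The <target> carries no xml:space attribute of its own here, yet its double 
space survives because the mode is taken from the sibling <source>.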

