Subject: XLIFF 2.0: Improving whitespace handling
Hi all,

In XLIFF 1.x, whitespace is handled via the standard xml:space attribute, which can be applied to the <file>, <group>, <trans-unit> and <alt-trans> elements. These elements are all structural containers: they do not allow mixed-content children, only child elements.

  file element:        <xsd:attribute ref="xml:space" use="optional"/>
  group element:       <xsd:attribute default="default" ref="xml:space" use="optional"/>
  trans-unit element:  <xsd:attribute default="default" ref="xml:space" use="optional"/>
  alt-trans element:   <xsd:attribute default="default" ref="xml:space" use="optional"/>

In addition to these, any element that supports attribute extensions (namely <alt-trans>, <bin-source>, <bin-target>, <bin-unit>, <bpt>, <bx/>, <ept>, <ex/>, <file>, <g>, <group>, <it>, <mrk>, <ph>, <seg-source>, <source>, <target>, <tool>, <trans-unit>, <x/> and <xliff>) can also use the xml:space attribute.

Note that a schema-aware XML parser will add any missing default-value attributes when parsing a document. This makes the xml:space attribute on the <file> element somewhat redundant, as its value will be overridden on all <group>, <trans-unit> and <alt-trans> elements.

Another example is the <internal-file> element, where in some cases it is very important that whitespace is preserved. To accomplish this, however, tools need to set the attribute on the <file> element, rather than being able to set xml:space directly on the <internal-file> element.

The elements where whitespace handling is most likely to matter are the content containers, whose child content is mixed content or text. These include <seg-source>, <target>, <internal-file>, the inline elements and <context>, to name a few.
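To make the override described above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sample document is heavily simplified (no XLIFF namespace, no required attributes), and the helpers apply_schema_defaults and effective_space are hypothetical names that mimic by hand what a schema-aware parser would do with the default="default" declarations quoted above.

```python
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
# Elements whose xml:space declaration carries default="default"
# in the schema lines quoted above.
DEFAULTED = {"group", "trans-unit", "alt-trans"}

# Simplified, namespace-free sample document (an assumption for brevity).
DOC = """<xliff><file xml:space="preserve"><body>
<group><trans-unit id="1"><source>a  b</source></trans-unit></group>
</body></file></xliff>"""


def apply_schema_defaults(root):
    """Mimic a schema-aware parser: add xml:space="default" where missing."""
    for elem in root.iter():
        if elem.tag in DEFAULTED and XML_SPACE not in elem.attrib:
            elem.set(XML_SPACE, "default")


def effective_space(root, target):
    """Return the xml:space value declared on the nearest ancestor-or-self."""
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        if XML_SPACE in node.attrib:
            return node.attrib[XML_SPACE]
        node = parents.get(node)
    return "default"  # nothing declared anywhere up the tree


root = ET.fromstring(DOC)
source = root.find(".//source")
before = effective_space(root, source)  # inherited from <file>
apply_schema_defaults(root)
after = effective_space(root, source)   # now shadowed by <trans-unit>
print(before, after)
```

In this sketch the value set on <file> reaches <source> only as long as the defaults have not been filled in; once they are, the nearest declaration is the defaulted one on <trans-unit>.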
Let me give an example of another issue with the current whitespace handling:

  <trans-unit id='1' xml:space='preserve'> <source>hello world!!</source> </trans-unit>

Above, I have set the xml:space attribute to 'preserve' so that the whitespace in my source text is kept exactly as written. However, there is a problem with this: I do not care about the whitespace elsewhere, e.g. between the <trans-unit> opening tag and the <source> opening tag. The above fragment could just as well have been written as follows:

  <trans-unit id='1' xml:space='preserve'><source>hello world!!</source></trans-unit>

However, according to the XML specification, the two fragments above are not equal, since the xml:space attribute affects all child elements (unless overridden in a child).

I don't have an immediate solution for how we solve these issues for XLIFF 2.0, but some initial ideas are:

1) Allow attribute extensions on all mixed-content elements in the specification, or at least on all those where xml:space would make a difference.

2) Create a section about whitespace handling in the specification, in particular adopting a convention that any <target> element should use the same whitespace handling as its sibling <source> element.

3) Let XML processors do what they do best: they will honour the xml:space attribute :)

cheers,
asgeir