Subject: RE: Preserving spaces inline
I had an action item to provide a possible re-writing of the section “B.2.1.2 Inline Elements”.
The text below is the proposal.
It is not possible to use [XML namespace] on XLIFF inline elements. It is advised that mixed Preserve Space behavior is NOT used inline in source formats. The recommended way to extract content with mixed Preserve Space behavior is for the Extractor agent to perform the following:
1. Normalize the whitespace in the content as needed (i.e. preserving whitespace spans where they need to be preserved, normalizing elsewhere).
2. Then extract the content in a unit with xml:space set to "preserve".
Even if the tool does not do step #1, the result is safe, and it lets humans decide which extra spaces, if any, should be deleted during translation or editing.
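For illustration only, the two steps above could be sketched roughly as follows in Python. This is a minimal sketch under stated assumptions, not proposed spec text: the (text, preserve) span representation and the helper names normalize_mixed and extract_unit are hypothetical, and "normalizing" is assumed to mean collapsing each run of whitespace to a single space.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URIs: the XML namespace (for xml:space) and the XLIFF 2.x core namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"
XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"


def normalize_mixed(spans):
    """Step 1: normalize whitespace only where it need not be preserved.

    `spans` is a hypothetical representation of the source content:
    a list of (text, preserve) pairs, where preserve=True marks a span
    whose whitespace is significant in the source format.
    """
    parts = []
    for text, preserve in spans:
        if preserve:
            parts.append(text)  # keep significant whitespace untouched
        else:
            # Assumption: normalization collapses whitespace runs to one space.
            parts.append(re.sub(r"\s+", " ", text))
    return "".join(parts)


def extract_unit(unit_id, spans):
    """Step 2: extract the content into a unit with xml:space="preserve"."""
    unit = ET.Element(
        f"{{{XLIFF_NS}}}unit",
        {"id": unit_id, f"{{{XML_NS}}}space": "preserve"},
    )
    segment = ET.SubElement(unit, f"{{{XLIFF_NS}}}segment")
    source = ET.SubElement(segment, f"{{{XLIFF_NS}}}source")
    source.text = normalize_mixed(spans)
    return unit


# Example: a normal paragraph fragment followed by a preformatted span.
spans = [("Hello   \n world: ", False), ("a  b\n  c", True)]
unit = extract_unit("u1", spans)
```

If step 1 is skipped (every span treated as preserved), the unit still carries xml:space set to "preserve", so no whitespace is lost; it is only left to the translator or editor to clean up.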
Whitespace handling can also be set independently for text segments and ignorable text portions within an Extracted unit, and for the source and target language within the same <segment> or <ignorable> element, using the OPTIONAL xml:space attribute on the <source> and <target> elements.
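As an illustration of the per-element setting described above, here is a small Python sketch of how a tool might resolve the effective whitespace handling for <source> and <target>. The sample unit and the helper name effective_space are hypothetical; the sketch assumes the usual xml:space behavior, where the nearest ancestor-or-self value applies, and it falls back to "default" when nothing is set. Namespace declarations are omitted from the sample for brevity.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the predefined XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

# Hypothetical unit: the source preserves whitespace, the target does not.
SAMPLE = """<unit id="u1" xml:space="default">
  <segment>
    <source xml:space="preserve">Line 1
  Line 2</source>
    <target>Ligne 1 Ligne 2</target>
  </segment>
</unit>"""


def effective_space(root, element):
    """Return the xml:space value in effect for `element`.

    Walks from the element up through its ancestors and returns the first
    xml:space value found; falls back to "default" if none is set.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = element
    while node is not None:
        value = node.get(XML_SPACE)
        if value is not None:
            return value
        node = parents.get(node)
    return "default"


unit = ET.fromstring(SAMPLE)
source = unit.find("segment/source")
target = unit.find("segment/target")
source_mode = effective_space(unit, source)  # set directly on <source>
target_mode = effective_space(unit, target)  # inherited from <unit>
```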
I’ve removed the bit about “extract all discernable portions with uniform whitespace handling into different elements” because units should be decided based on “paragraphs” or equivalent logical structures, not inline formatting.
I’ve removed the bit about “mixed whitespace handling behavior is not likely to survive Segmentation Modification” because the PRs for segmentation modification do include actions for xml:space, in line with the recommended behavior for Extractor agents that we would add here.
I’ve removed the bit about “can be also extracted as original data stored outside of the translatable content” (and its corresponding example) because it causes many issues and we do not want to give anyone the idea of doing something like that.