RE: [xliff] RE: Preserving spaces inline

Great. Thanks.

I’ll double-check when the PDF is available.

-ys

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of David Filip
Sent: Thursday, March 31, 2016 10:19 AM
To: Yves Savourel <ysavourel@enlaso.com>
Cc: XLIFF Main List <xliff@lists.oasis-open.org>
Subject: Re: [xliff] RE: Preserving spaces inline

Hi Yves, all,

I've implemented this with minor variations in the DocBook source on our SVN. Please note that this is not yet printed as part of the editor's draft.

I expect some more changes from Felix and will print with all changes on Monday next week.

Cheers

Dr. David Filip

===========

OASIS XLIFF OMOS TC Chair

OASIS XLIFF TC Secretary, Editor, Liaison Officer

Spokes Research Fellow

ADAPT Centre

KDEG, Trinity College Dublin

Mobile: +420-777-218-122

On Tue, Mar 29, 2016 at 3:44 PM, Yves Savourel <ysavourel@enlaso.com> wrote:

Hi all,

I had an action item to provide a possible re-writing of the section “B.2.1.2 Inline Elements”.

The text below is the proposal.

==========

It is not possible to use [XML namespace] on XLIFF inline elements. It is advised that mixed Preserve Space behavior is NOT used inline in source formats. The recommended way to extract content with mixed Preserve Space behavior is for the Extractor agent to perform the following:

1. Normalize the whitespace in the content as needed (i.e. preserving whitespace spans where they need to be preserved, normalizing elsewhere).
2. Then extract the content in a unit with xml:space set to "preserve".

Even if the tool does not do step #1, the result is safe, and it lets humans decide what extra spaces, if any, should be deleted during translation or editing.

Whitespace handling can also be set independently for text segments and ignorable text portions within an Extracted unit, and for the source and target languages within the same <segment> or <ignorable> element, using the OPTIONAL xml:space attribute on the <source> and <target> elements.

==========
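
To illustrate the two recommended Extractor steps, here is a minimal Python sketch. It is not part of the proposed text and is not taken from any existing tool. The helper names normalize_mixed and build_unit and the (text, preserve) input shape are assumptions made only for this example; the only things taken from XLIFF itself are the core namespace and the xml:space attribute on <unit>.

```python
import re
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def normalize_mixed(parts):
    """Step 1 (hypothetical helper): normalize whitespace where allowed.

    `parts` is a list of (text, preserve) tuples describing the source
    content. Spans flagged preserve=True are kept verbatim; in all other
    spans each run of whitespace is collapsed to a single space.
    """
    out = []
    for text, preserve in parts:
        out.append(text if preserve else re.sub(r"\s+", " ", text))
    return "".join(out)


def build_unit(unit_id, text):
    """Step 2 (hypothetical helper): extract into a unit with xml:space="preserve"."""
    unit = ET.Element(
        f"{{{XLIFF_NS}}}unit",
        {"id": unit_id, f"{{{XML_NS}}}space": "preserve"},
    )
    segment = ET.SubElement(unit, f"{{{XLIFF_NS}}}segment")
    source = ET.SubElement(segment, f"{{{XLIFF_NS}}}source")
    source.text = text
    return unit


# Example input: the middle span must keep its three spaces.
parts = [
    ("Hello   \n  world: ", False),
    ("a   b", True),
    ("  \t done", False),
]
unit = build_unit("u1", normalize_mixed(parts))
print(ET.tostring(unit, encoding="unicode"))
```

Because the whole unit is marked "preserve", the spaces kept in step 1 survive downstream, and skipping step 1 would only leave extra spaces for a human to clean up.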

I’ve removed the bit about “extract all discernable portions with uniform whitespace handling into different elements” because units should be decided based on “paragraphs” or equivalent logical structures, not inline formatting.

I’ve removed the bit about “mixed whitespace handling behavior is not likely to survive Segmentation Modification” because the PRs for segmentation modification do include an action for xml:space, in line with the recommended behavior for Extractor agents that we would add here.

I’ve removed the bit about “can be also extracted as original data stored outside of the translatable content” (and its corresponding example) because it causes many issues and we do not want to give anyone the idea of doing something like that.

Cheers,
-yves
