I concur, xml:space is the right way to go. Performing normalization at extraction and then setting preserve space for the content is likely going to be the safest
thing extractors can do in practice.
I agree that putting whitespace in inline elements is a very bad practice. It can easily lead to issues when extractors are implemented by people who are unaware of
the differences between languages, such as the need to add or remove whitespace between segments in many cases.
From: email@example.com [mailto:firstname.lastname@example.org]
On Behalf Of Patrik Mazanek
Sent: den 7 december 2015 09:11
To: Yves Savourel <email@example.com>; XLIFF Main List <firstname.lastname@example.org>
Subject: RE: [xliff] Preserving spaces inline
I agree, I think xml:space takes care of preserving spaces and that should be good enough.
When it comes to whitespace, I think it would also make sense to emphasize how important it is to use xml:space for the <ignorable> element, as it is basically designed to hold whitespace.
From: email@example.com [mailto:firstname.lastname@example.org] On Behalf Of Yves Savourel
Sent: Thursday, December 3, 2015 5:34 PM
To: XLIFF Main List <email@example.com>
Subject: [xliff] Preserving spaces inline
I'm looking at the 2.1 draft specification (hopefully the latest one):
In section B.2.1.2 Inline Elements:
In my opinion, the end of the section, from the paragraph starting with "Preserved whitespaces can be also extracted as original data stored outside..." should be completely removed, including the example B.7.
I think it is an extremely bad practice to place spans of white spaces into inline codes. It brings many issues during translation and editing and for leveraging, not to mention that it increases the number of inline codes. I know of one commercial tool that does
that in XLIFF 1.2 when extracting <pre> entries from HTML, and we have had tons of issues with such encoding.
If the specification has to provide a solution for preserving the spaces of some section of a segment, I think the simplest and safest way to do it is to preserve the spaces of the whole segment, or unit.
The Extraction tool can do this by:
-1) Normalizing the spaces in the content as needed (i.e. preserving the spans where they need to be preserved, normalizing elsewhere).
-2) Then extracting the unit with xml:space="preserve"
Even if the tool does not do #1, the result is simpler and safer, and it lets humans decide which extra spaces, if any, should be deleted during translation or editing.
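As a rough illustration of the two steps above, here is a minimal Python sketch. It is not taken from any existing tool: the function names normalize_spaces and build_unit, and the idea of passing the spans to preserve as (start, end) character offsets, are assumptions made only for this example. It also assumes the XLIFF 2.0 core namespace and a single segment per unit.

```python
import re
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"  # assumed: XLIFF 2.0 core namespace


def normalize_spaces(text, preserve_spans=()):
    """Step 1: collapse each whitespace run to one space, except inside
    the (start, end) offset spans that must be kept verbatim.
    Spans are assumed not to overlap (hypothetical helper)."""
    out = []
    pos = 0
    for start, end in sorted(preserve_spans):
        out.append(re.sub(r"\s+", " ", text[pos:start]))  # normalize before the span
        out.append(text[start:end])                       # keep the span untouched
        pos = end
    out.append(re.sub(r"\s+", " ", text[pos:]))           # normalize the remainder
    return "".join(out)


def build_unit(unit_id, text, preserve_spans=()):
    """Step 2: emit the unit with xml:space="preserve" so that downstream
    tools do not touch the whitespace that survived step 1."""
    unit = ET.Element(
        f"{{{XLIFF_NS}}}unit",
        {"id": unit_id, f"{{{XML_NS}}}space": "preserve"},
    )
    segment = ET.SubElement(unit, f"{{{XLIFF_NS}}}segment")
    source = ET.SubElement(segment, f"{{{XLIFF_NS}}}source")
    source.text = normalize_spaces(text, preserve_spans)
    return unit


if __name__ == "__main__":
    raw = "Line  one\n\tkeep    this"
    keep_from = raw.index("keep")
    unit = build_unit("u1", raw, [(keep_from, len(raw))])
    print(ET.tostring(unit, encoding="unicode"))
```

The point of the sketch is that no inline code is created for the preserved spaces: the significant whitespace stays as plain text in the source, and the single xml:space="preserve" on the unit protects it.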