OASIS Mailing List Archives

xliff message


Subject: Re: [xliff] RE: Preserving spaces inline

Hi Yves, all,
I've implemented this with minor variations in the DocBook source on our SVN. Please note that this is not yet printed as part of the editor's draft.
I expect some more changes from Felix and will print with all changes on Monday next week.

Dr. David Filip
OASIS XLIFF TC Secretary, Editor, Liaison Officer
Spokes Research Fellow
ADAPT Centre
KDEG, Trinity College Dublin
Mobile: +420-777-218-122

On Tue, Mar 29, 2016 at 3:44 PM, Yves Savourel <ysavourel@enlaso.com> wrote:

Hi all,


I had an action item to provide a possible re-writing of the section “B.2.1.2 Inline Elements”.


The text below is the proposal.




It is not possible to use [XML namespace] on XLIFF inline elements. It is advised that mixed Preserve Space behavior is NOT used inline in source formats. The recommended way to extract content with mixed Preserve Space behavior is for the Extractor agent to perform the following:


1.  Normalize the whitespace in the content as needed (i.e. preserving whitespace spans where they need to be preserved, normalizing elsewhere).

2.  Then extract the content in a unit with xml:space set to "preserve".


Even if the tool does not do step #1, the result is safe, and it lets humans decide what extra spaces, if any, should be deleted during translation or editing.
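The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the proposed wording: the helper names `normalize_spans` and `extract_unit` are hypothetical, the input is assumed to arrive as (text, preserve) pairs already identified by the Extractor, and the XLIFF elements are built without their namespace to keep the fragment short.

```python
import re
import xml.etree.ElementTree as ET

# The xml: prefix is bound to this namespace by the XML specifications.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def normalize_spans(spans):
    """Step 1: keep whitespace in spans flagged as preserved,
    collapse whitespace runs to a single space everywhere else.

    `spans` is a hypothetical input shape: a list of (text, preserve) pairs.
    """
    parts = []
    for text, preserve in spans:
        parts.append(text if preserve else re.sub(r"\s+", " ", text))
    return "".join(parts)


def extract_unit(unit_id, spans):
    """Step 2: put the normalized content in a unit with xml:space="preserve"."""
    unit = ET.Element("unit", {"id": unit_id, XML_SPACE: "preserve"})
    segment = ET.SubElement(unit, "segment")
    source = ET.SubElement(segment, "source")
    source.text = normalize_spans(spans)
    return unit


# Example: ordinary prose around a span whose spacing is significant.
spans = [
    ("Run   the\n command ", False),
    ("ls   -l", True),
    ("  now.", False),
]
unit = extract_unit("u1", spans)
print(ET.tostring(unit, encoding="unicode"))
```

Because the whole unit is marked as preserve, downstream tools do not need to know which spans were originally significant; the normalization has already been applied where it was safe.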


Whitespace handling can also be set independently for text segments and ignorable text portions within an Extracted unit, and for the source and target languages within the same <segment> or <ignorable> element, using the OPTIONAL xml:space attribute on the <source> and <target> elements.
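As a small illustration of setting xml:space separately on <source> and <target>, here is a Python sketch. The helper `make_segment` and its parameters are hypothetical, and the elements are again built without the XLIFF namespace for brevity.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is bound to this namespace by the XML specifications.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def make_segment(source_text, target_text,
                 source_space="default", target_space="default"):
    """Build a <segment> whose <source> and <target> each carry
    their own xml:space value."""
    segment = ET.Element("segment")
    source = ET.SubElement(segment, "source", {XML_SPACE: source_space})
    source.text = source_text
    target = ET.SubElement(segment, "target", {XML_SPACE: target_space})
    target.text = target_text
    return segment


# Source keeps its double space; target is left to default handling.
segment = make_segment("a  b", "a b",
                       source_space="preserve", target_space="default")
print(ET.tostring(segment, encoding="unicode"))
```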




I’ve removed the bit about “extract all discernable portions with uniform whitespace handling into different elements” because units should be decided based on “paragraphs” or equivalent logical structures, not inline formatting.


I’ve removed the bit about “mixed whitespace handling behavior is not likely to survive Segmentation Modification” because the PRs for segmentation modification do include actions for xml:space, in line with the recommended behavior for Extractor agents that we would add here.


I’ve removed the bit about “can be also extracted as original data stored outside of the translatable content” (and its corresponding example) because it causes many issues and we do not want to give anyone the idea of doing something like that.




