Subject: RE: [xliff] Question on xml:space handling
Hi Ryan, Yves,
I know I argued for, but lost on, making xml:space="preserve" the default for content data, with normalization happening at extraction or as a deliberate enrichment step. So I agree with the default preserve point.
I think you should also assume preserve as the default for all third-party content, as it is likely that tools will put skeleton-like data in the files, just like many do in XLIFF 1.2. Removing or changing spaces there would not be desirable behavior.
I agree that a change in behavior is preferable after gathering feedback both internally to MS and externally in the community. We will make the change in our next iteration (unless someone else wants to beat us to the punch :)
Hi Ryan, all,
Looking more closely at the XML specification, I agree with Ryan that it's valid to remove spaces when xml:space is default.
But I would suggest that a better behavior for the elements <source> and <target> would be to either preserve or normalize.
From a logical viewpoint, if there is an <ignorable><source> </source></ignorable> entry, it is very likely there because it separates two segments that will still need that separation after merging. So a better default XLIFF processor behavior would be to either preserve or normalize.
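To make the merging concern concrete, here is a small sketch in Python (not the MS XLIFF OM, and not code from any real tool). The unit content and the helper name merge_sources are made up for illustration, and the XML is left without a namespace to keep it short. It only shows what happens to the joined text when a whitespace-only <source> is emptied on reading.

```python
import xml.etree.ElementTree as ET

# Hypothetical unit: two segments separated by a whitespace-only ignorable.
UNIT = """<unit id="1">
  <segment><source>First sentence.</source></segment>
  <ignorable><source> </source></ignorable>
  <segment><source>Second sentence.</source></segment>
</unit>"""


def merge_sources(unit_xml, drop_whitespace_only):
    """Join the <source> text of every segment/ignorable in document order.

    When drop_whitespace_only is True, a <source> that contains only
    whitespace is treated as empty, which mimics a reader that strips it.
    """
    unit = ET.fromstring(unit_xml)
    parts = []
    for child in unit:
        source = child.find("source")
        if source is None:
            continue
        text = source.text or ""
        if drop_whitespace_only and text.strip() == "":
            text = ""
        parts.append(text)
    return "".join(parts)


print(merge_sources(UNIT, drop_whitespace_only=True))   # First sentence.Second sentence.
print(merge_sources(UNIT, drop_whitespace_only=False))  # First sentence. Second sentence.
```

With the whitespace dropped, the two sentences end up glued together after merging, which is the separation problem described above.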
Yves logged two issues on the MS XLIFF OM github project (thanks Yves for testing!):
<source> with only spaces are emptied on deserialization
I have no doubt that the first one is a bug since the default for <data> is preserve.
The second one is trickier. The default for <source> and <target> is to allow the application to decide. In this case, the OM is using the .NET XmlReader, which removes whitespace by default. So, in order to preserve whitespace during serialization/deserialization, the user must set xml:space="preserve" to override the default behavior of XmlReader.
So my question to this esteemed body of professionals: should the MS XLIFF OM override the default behavior of XmlReader and preserve whitespace when xml:space="default"?
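For anyone who wants to compare the options side by side, below is a rough Python sketch of the decision being discussed. It is not how the OM or XmlReader is implemented. The names ELEMENT_DEFAULTS, walk, apply_mode and source_texts are hypothetical, the three fallback modes ("strip", "normalize", "preserve") are just labels for the behaviors mentioned in this thread, and "normalize" is assumed here to mean collapsing each run of whitespace to a single space without trimming.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the predefined XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

# Assumption taken from this thread: <data> defaults to preserve,
# everything else is left to the application (the "fallback" below).
ELEMENT_DEFAULTS = {"data": "preserve"}


def walk(elem, inherited, fallback):
    """Yield (element, whitespace mode), inheriting xml:space from ancestors."""
    declared = elem.get(XML_SPACE)
    if declared in ("preserve", "default"):
        inherited = declared
    if inherited == "preserve":
        mode = "preserve"
    else:
        mode = ELEMENT_DEFAULTS.get(elem.tag, fallback)
    yield elem, mode
    for child in elem:
        yield from walk(child, inherited, fallback)


def apply_mode(text, mode):
    """Apply one of the three illustrative whitespace behaviors."""
    if mode == "preserve":
        return text
    if mode == "normalize":
        return re.sub(r"\s+", " ", text)
    return text.strip()  # "strip": whitespace-only content becomes empty


def source_texts(doc, fallback):
    """Return the text of every <source> under the chosen application default."""
    root = ET.fromstring(doc)
    return [apply_mode(e.text or "", m)
            for e, m in walk(root, "default", fallback)
            if e.tag == "source"]


DOC = """<unit id="1">
  <segment><source>Hello   world</source></segment>
  <ignorable><source>  </source></ignorable>
  <ignorable xml:space="preserve"><source>  </source></ignorable>
</unit>"""

for fallback in ("strip", "normalize", "preserve"):
    print(fallback, source_texts(DOC, fallback))
```

In this sketch only the "strip" fallback empties the second <source>; "normalize" keeps a single space, and an explicit xml:space="preserve" wins in every case.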