
Subject: Re: [office] Alternatives for OFFICE-2102

Dear TC,

Let me try a new summary of the problems with the broken whitespace handling in ODF 1.2:
  1. Not all descendants of text:p/h allow the whitespace elements (text:tab, text:s, text:line-break) in the RelaxNG schema, although 6.1.2 "White Space Characters" §2 requires whitespace handling for all of their descendants that allow character data.
  2. 6.1.2 "White Space Characters" refers to descendants of text:p/h, while the notes in 3.18 "White Space Processing and EOL Handling" refer to children. "Descendants" is correct, because text:span elements can already be nested and the handling should not apply only to the outermost text:span element.
  3. The existing whitespace handling described in 6.1.2 "White Space Characters" does not work, as Jos mentioned in our last call. Jos provided a simple example in which pretty printing adds a new space character to the document.
    The original document:


    when pretty printed, ends up as


    The whitespace that pretty printing inserts between the spans is collapsed to a single space, which is still one space too many.
    NOTE: An ODT test document with the pretty printed XML and the additional space is attached to this mail.
    In addition, I have attached a second version, in which I added some text between the spans and inserted line breaks manually (as some custom pretty printers do before and after elements).
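
To make problem 3 concrete, here is a minimal Python sketch. The two spans and their text are hypothetical stand-ins (they are not Jos' original example), and collapsed_text is only a rough approximation of the 6.1.2 collapsing rules: every run of space, tab, CR and LF becomes one space, and whitespace at the paragraph start and end is dropped.

```python
import re
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def collapsed_text(paragraph_xml):
    """Approximate the ODF 1.2 whitespace collapsing for one paragraph."""
    root = ET.fromstring(paragraph_xml)
    # Concatenate all character data of the paragraph and its descendants.
    raw = "".join(root.itertext())
    # Collapse each whitespace run to one space, then trim the paragraph edges.
    return re.sub(r"[ \t\r\n]+", " ", raw).strip(" ")


# Hypothetical paragraph: two adjacent spans with no whitespace between them.
compact = (
    f'<text:p xmlns:text="{TEXT_NS}">'
    "<text:span>Hello</text:span><text:span>World</text:span>"
    "</text:p>"
)

# The same paragraph after a generic XML pretty printer has indented it.
pretty = f"""<text:p xmlns:text="{TEXT_NS}">
  <text:span>Hello</text:span>
  <text:span>World</text:span>
</text:p>"""

print(repr(collapsed_text(compact)))  # 'HelloWorld'
print(repr(collapsed_text(pretty)))   # 'Hello World'
```

The line break and indentation between the two spans survive collapsing as one space, so the pretty printed paragraph no longer has the same text as the original.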

So why are we doing all this?
The likely reason for the whitespace handling is to let ODF applications identify and discard the extra whitespace that is inserted when users pretty print the XML in some other text or XML editor.

There are many possible quick fixes that would save time on changing existing ODF applications, but, purely in theory, what would the fix be if whitespace handling is to work in ODF 1.3?

It is relatively simple:
  1. Add the whitespace elements (text:tab, text:s, text:line-break) to the RelaxNG schema for every descendant of text:p/h that already allows character data (perhaps also define "character data").
  2. Make the wording consistently refer to "descendants".
  3. 6.1.2 "White Space Characters" (and likely other sections) have to be reworked so that
    • ODF 1.3 producers
      1. Will always replace multiple consecutive space characters with text:s and its count attribute
      2. Will replace even every single space directly before or after any descendant element of text:p/h with text:s (to avoid Jos' problem)
    • ODF 1.3 consumers
      1. Will remove any space character directly before or after any descendant element of text:p/h
      2. Will remove any line break together with its adjacent whitespace characters
  4. For the above to work, the version attribute(s) must become mandatory in ODF 1.3, which should be done anyway to make developers' lives easier.
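
The producer and consumer rules in point 3 could be sketched in Python roughly as follows. This is an illustration under assumptions, not spec text: produce, consume and wrap are hypothetical helper names, the producer only handles a flat string without child elements, and it treats the paragraph boundaries like element boundaries so that leading and trailing spaces survive the consumer.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
S_TAG = f"{{{TEXT_NS}}}s"
C_ATTR = f"{{{TEXT_NS}}}c"


def produce(text):
    """Producer sketch: encode significant spaces of a flat string as text:s."""
    def to_s(match):
        n = len(match.group(0))
        return f'<text:s text:c="{n}"/>' if n > 1 else "<text:s/>"
    # Runs of two or more spaces, plus a single space at either edge.
    return re.sub(r" {2,}|\A | \Z", to_s, escape(text))


def wrap(content):
    """Hypothetical helper: put encoded content into a namespaced text:p."""
    return f'<text:p xmlns:text="{TEXT_NS}">{content}</text:p>'


def _clean(chunk):
    """Apply the two consumer rules to one piece of character data."""
    if not chunk:
        return ""
    # Consumer rule 2: drop line breaks together with adjacent whitespace.
    chunk = re.sub(r"[ \t]*[\r\n]+[ \t\r\n]*", "", chunk)
    # Consumer rule 1: drop whitespace that touches an element boundary.
    return chunk.strip(" \t")


def consume(elem):
    """Consumer sketch: recover the text of text:p and all its descendants."""
    if elem.tag == S_TAG:
        # A missing text:c attribute means a single space.
        return " " * int(elem.get(C_ATTR, "1"))
    parts = [_clean(elem.text)]
    for child in elem:
        parts.append(consume(child))
        parts.append(_clean(child.tail))
    return "".join(parts)


encoded = produce(" Hello  World ")
print(encoded)
print(repr(consume(ET.fromstring(wrap(encoded)))))

# Whitespace added by a pretty printer is discarded completely.
pretty = wrap("\n  <text:span>Hello</text:span>\n  <text:span>World</text:span>\n")
print(repr(consume(ET.fromstring(pretty))))
```

Because every text node inside text:p starts and ends at a tag, trimming both edges of each node covers "before and after any descendant element". Any space that is meant to be kept next to a tag therefore has to arrive as text:s, which is what producer rule 2 asks for.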
What do you think?

Hope it helps,

2017-03-13 17:33 GMT+01:00 Regina Henschel <regina.henschel@libreoffice.org>:
Hi all,

I have tried to sort out the alternatives for OFFICE-2102 for myself. My summary is attached.

Please correct it where necessary and add your aspects in case I forgot something.

Kind regards

Attachment: BrokenWhitespacehandling.odt
Description: application/vnd.oasis.opendocument.text

Attachment: BrokenWhitespacehandling2.odt
Description: application/vnd.oasis.opendocument.text
