
Subject: Re: [office] Alternatives for OFFICE-2102

After thinking it over, I find the whitespace-handling feature about as useful for end users as flat XML, which is similarly unstable: it has no generic mapping for xml:id (potential clashes) and no way to include the other files of the directory (loss of data). Such developer features are nice for playing around, but they are not stable enough to cover all potential edge cases in a production environment. Therefore, I agree not to waste further time on it. To me, it is absolutely fine to fix the wording and, for instance, add an informal note stating that it works in most cases but will cause problems in certain cases, such as fields.

In the future, the people behind a file format would do better to spend their time providing open-source developer extensions for existing editors such as Atom, Emacs, vi, or whatever is currently in fashion. Then there would be no need for such a workaround as whitespace handling in the applications that process the files.
I changed my mind when I realised that any pretty printing will break existing XML signatures, and signatures will become more and more important in the future. In addition, I have always wanted better out-of-the-box support for ODF in text editors. Fixing whitespace handling would fix the wrong side of the problem: it would make ODF implementations more difficult without giving me, as a developer using pretty printing, any better usability.
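
To illustrate the point about signatures, here is a minimal sketch in Python (3.9 or later). It assumes that a digest over the canonicalised XML stands in for the digest step of a real XML signature; the sample fragment and the helper name c14n_digest are made up for illustration and are not taken from ODF or from any signature library.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical sample content, not a real ODF fragment.
original = "<p><span>Hello</span><span>world</span></p>"

# Pretty-print the same tree the way a generic XML editor might.
root = ET.fromstring(original)
ET.indent(root)  # inserts line breaks and indentation between elements
pretty = ET.tostring(root, encoding="unicode")


def c14n_digest(xml_text: str) -> str:
    """Return the SHA-256 digest of the canonical (C14N 2.0) form of xml_text."""
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonicalisation keeps whitespace in text content by default,
# so the two digests differ and a signature check would fail.
print(c14n_digest(original) == c14n_digest(pretty))
```

The comparison prints False: the indentation added by pretty printing becomes part of the canonical form, so a digest computed before pretty printing no longer matches.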

Talk to you later today.

2017-04-02 12:01 GMT+02:00 Svante Schubert <svante.schubert@gmail.com>:
Hello Michael,

I believe we have quite different views on the required approach here.
While it seems to me that you are fixing the issue as it was written, minimising the costs for your application, I do not see how that solves the overall problem behind the feature the issue is about: allowing pretty printing of the XML by non-ODF editors.

We should ask ourselves these questions:
  1. Why was whitespace handling added to the ODF specification at all, and
  2. how can the intent be fulfilled, or
  3. should it not be fulfilled, and should we deprecate all related work?

2017-03-31 15:39 GMT+02:00 Michael Stahl <mstahl@redhat.com>:

On 30.03.2017 19:03, Svante Schubert wrote:
> So why are we doing all this?
> The reason for whitespace handling is likely that ODF applications are
> able to identify and delete additional space inserted by pretty printing
> the XML being done by users in any other text/XML editor.

and we have already seen that this doesn't work perfectly in every case,
and won't work perfectly with generic heuristics, without the
pretty-printer applying ODF-specific rules.

In the mail I had already suggested a possible solution for Jos' use case.
Take your time and also give an example where it does not work, so that we have a chance to improve the specification.

> There are many variations to do quick fixes to save some time fixing
> existing ODF applications, but just for the theory what would be the
> fix if whitespace handling should work with ODF 1.3?
> It is relatively simple:
>  1. Add whitespace elements (text:tab, text:s, text:line-break) in the
>     RelaxNG schema for every descendant of text:p/h that has already
>     character data (perhaps define character data)

let's take the first element from the list of <text:p> child elements
that don't currently do whitespace processing: <dr3d:scene> 10.5.2

it has a child <svg:title>, which allows <text/> content - so you want
to do whitespace processing there since it's a descendant of <text:p>.

however, <dr3d:scene> does not necessarily occur in a paragraph, it may
also occur in a <style:handout-master> element, which is never a
descendant of <text:p>.

do we now say that <svg:title> must have whitespace processing when it
occurs as a descendant of <text:p>, but not otherwise?  to me that is
the road to madness.
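
The context dependence described above can be made concrete with a small Python sketch. It assumes a made-up wrapper element named root and a simplified fragment that is not a valid ODF document; the function name titles_with_context is hypothetical. It only shows that the same <svg:title> element would need an ancestor check before a consumer could decide whether to process whitespace.

```python
import xml.etree.ElementTree as ET

NS = {
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "dr3d": "urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0",
    "svg": "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0",
}

# Simplified, illustrative fragment: one scene inside a paragraph,
# one inside a handout master. "root" is a made-up wrapper element.
FRAGMENT = f"""
<root xmlns:text="{NS['text']}" xmlns:style="{NS['style']}"
      xmlns:dr3d="{NS['dr3d']}" xmlns:svg="{NS['svg']}">
  <text:p><dr3d:scene><svg:title>in paragraph</svg:title></dr3d:scene></text:p>
  <style:handout-master><dr3d:scene><svg:title>in handout</svg:title></dr3d:scene></style:handout-master>
</root>
""".strip()


def titles_with_context(xml_text: str) -> dict:
    """Map each svg:title text to True if it has a text:p ancestor."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    text_p = f"{{{NS['text']}}}p"
    result = {}
    for title in root.iter(f"{{{NS['svg']}}}title"):
        node, inside = title, False
        while node in parent:
            node = parent[node]
            if node.tag == text_p:
                inside = True
                break
        result[title.text] = inside
    return result


print(titles_with_context(FRAGMENT))
```

Under a "descendant of <text:p>" rule, the first title would get whitespace processing and the second would not, although both are the same element type with the same content model.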

to me a necessary criterion to apply whitespace processing is that the
text content of the <text:p> descendant is conceptually part of the
paragraph text - so all captions on drawing objects and authors on
annotations and that sort of stuff shouldn't do whitespace processing.

Do you really think that exchanging whitespace for elements, and vice versa, is a road to madness?
Perhaps once we realise that there is breaking text content that is not within paragraphs/headings?
Our specification should cover the intended use case, and as long as no one comes up with a different explanation, it is all about removing whitespace inserted by pretty printing.
Otherwise, remove the use case. In either case, make a consistent, clean decision.

>  2. Fix the wording consistent to "descendants"

"descendants" is better than "children", and furthermore i would perhaps
move all mention of "descendants" into non-normative notes, and leave
the normative text to say "processing shall be done if and only if the
element allows <text:s> etc. as children".

I cannot follow your reasoning for the non-normative note. Why is it non-normative? Are you already jumping to a certain solution you have in mind?
The additional normative text you suggested already implies the solution you favour: changing as little as possible in the specification and continuing to allow pretty printing to insert additional whitespace into the content. Is this the best we can do?
>  3. 6.1.2 "White Space Characters"
>     <http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html#White-space_Characters> (and
>     likely other sections) have to be reworked so that
>       * ODF 1.3 producers
>          1. Will exchange multiple space characters always to text:s
>             with count attribute
>          2. Will exchange even every single space before and after any
>             descendant element of text:p/h with text:s (to avoid Jos'
>             problem)
>       * ODF 1.3 consumers
>          1. Will remove any space character before and after any
>             descendant element of text:p/h
>          2. Will remove any linebreak and adjacent whitespace characters

as said above i disagree with "any descendant".
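
The producer and consumer rules proposed in item 3 can be sketched roughly as follows, in Python. This is one reading of the proposal and not specification text: it works on plain text runs only (no XML escaping, no element handling), the function names are made up, and it assumes the "count attribute" is the text:c attribute of <text:s> from ODF 1.2.

```python
import re


def encode_space_runs(text: str) -> str:
    """Producer side: replace every run of two or more spaces
    with a single <text:s/> element carrying the run length."""
    return re.sub(
        r" {2,}",
        lambda m: f'<text:s text:c="{len(m.group(0))}"/>',
        text,
    )


def strip_pretty_printing(text: str) -> str:
    """Consumer side: remove every line break together with the
    whitespace characters adjacent to it."""
    return re.sub(r"[ \t\r\n]*[\r\n][ \t\r\n]*", "", text)


print(encode_space_runs("a   b"))
print(strip_pretty_printing("one\n    two"))
```

Under this reading the consumer deletes a line break outright instead of collapsing it to a space, so "one" and "two" are joined. A space that must survive next to an element boundary would therefore have to be written as <text:s/> by the producer, which is what the producer rule about single spaces before and after descendant elements is meant to cover.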

>  4. To make the above work, the version attribute(s) shall become
>     mandatory for ODF 1.3, which should be done anyway to ease a
>     developer's life.
> What do you think?

i think we should restrict ourselves to specify something that has as
much backwards compatibility as possible with existing implementations,
with particular regard to how existing consumers will interpret
whitespace in ODF 1.3 documents.

given the current inconsistencies in implementations it's not possible
to make everybody happy, but we should not introduce additional
compatibility breakage that isn't there currently.

If there is something broken in the specification, we should consider fixing it.
If the fix requires others to update, the price is not too high in any case. 
Broken backward compatibility is not evil per se.

Have you tried a test document with the incompatible changes (whitespace elements where they are not allowed)? What happens?

Enjoy your Sunday, Michael!

