Subject: Re: [office] Fwd: ODF spec question (white-space processing)


David Faure wrote:
> On Monday 04 September 2006 11:26, Michael Brauer - Sun Germany - ham02 - Hamburg wrote:
>>Daniel Vogelheim wrote:
>>>>David Faure wrote:
>>>>>So  <text:p>         foo</text:p>
>>>>>will have only one word and zero spaces in Writer.
>>>>>I expect it to have 1 space and one word.
> Note that this would indeed be "collapsing" (to a single space), instead of removing.
> The spec does talk about collapsing, not about removing.

Well, the sentence you are referring to continues with "they [white-space 
characters] are collapsed, in other words they are processed in the same way 
that [HTML4] processes them".

The term "collapsed" may be a little imprecise here, but the essential 
information is that they are processed as in HTML.

>>>>HTML ignores white space characters behind the start element tag, 
>>>Actually, HTML does no such thing. Neither the HTML spec nor actual
>>>HTML browsers remove existing whitespace behind a start element tag.
>>What browser are you using? At least my Mozilla as well as Firefox doesn't 
>>display them.
> Mozilla does keep one space after a start element, in my tests:
> <html>
>     <body>
>        Foo<span>          bar</span>
>     </body>
> </html>
> This shows "Foo bar" in mozilla (and in konqueror), as I expected,
> and not "Foobar".
> White space is collapsed, not removed.

It seems that we are mixing up two cases here: the start and end tags of 
paragraphs, and the start and end tags of markup inside a paragraph.

The initial example refers to the paragraph start tag <text:p>. Here, the 
white-space characters are in fact removed. Mozilla does the same with <p> 
tags, so OpenDocument (and also OOo) behaves like HTML here.

Your example is about start tags within a paragraph. Here, not only Mozilla 
but also OpenDocument (and OOo) keep a single space. This is defined by the 
OpenDocument specification as follows: "The preceding character can be 
contained in the same element, in the parent element, or in the preceding 
sibling element, as long as it is contained within the same paragraph element 
and the element in which it is contained processes white-space characters as 
described above."

So, I think we are actually on the same page.
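
To make the two cases concrete, here is a minimal Python sketch of the rule as 
I read it. It is only an illustration, not the code used by OOo or Mozilla. 
The function name collapse_paragraph is made up for this mail, plain <p> and 
<span> elements stand in for <text:p> and its children to avoid namespace 
declarations, and trailing white space is not treated specially. The start of 
the paragraph is assumed to behave as if it were preceded by a white-space 
character, which is what drops the leading white space.

```python
import xml.etree.ElementTree as ET

# Characters that are normalized to a single SPACE.
WHITESPACE = " \t\r\n"


def collapse_paragraph(xml_fragment: str) -> str:
    """Return the text of one paragraph after white-space collapsing.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_fragment)
    out = []
    # Assumption: the paragraph start counts as "preceded by white space",
    # so white space directly after the paragraph start tag is dropped.
    prev_is_space = True
    # itertext() walks text and tails of child elements in document order,
    # so the "preceding character" may come from a parent or sibling element.
    for chunk in root.itertext():
        for ch in chunk:
            if ch in WHITESPACE:
                if not prev_is_space:
                    out.append(" ")  # first white-space character of a run
                prev_is_space = True  # further ones in the run are ignored
            else:
                out.append(ch)
                prev_is_space = False
    return "".join(out)


if __name__ == "__main__":
    # Paragraph start: leading white space is removed.
    print(repr(collapse_paragraph("<p>         foo</p>")))
    # Start tag inside a paragraph: the run collapses to one space.
    print(repr(collapse_paragraph("<p>Foo<span>          bar</span></p>")))
```

Under these assumptions the first call gives 'foo' and the second gives 
'Foo bar', which matches the two examples discussed in this thread.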

>>Well, the intention behind the white-space processing rules is to allow 
>>authors to pretty-print paragraph text. HTML is used as an archetype here, 
>>because its rules do work very well in practice. It may be that we could find 
>>some better wording for the relation of the OpenDocument white space 
>>processing rules to HTML, but IMHO it is consistent with the HTML 
>>specification to ignore white space characters at the paragraph start.
> But HTML doesn't do that, and therefore OpenDocument shouldn't do it either.

What do you think HTML is not doing? In your example above, the white-space 
characters are collapsed just as they are in OpenDocument.

And if I try,

<p>    Foo</p>

in my Mozilla, it does not display a space character in front of the "Foo", 
just like in OpenDocument. Is your Mozilla behaving differently?

In any case, I think it is very convenient to give authors the possibility to 
add a line break after the opening tag of a paragraph without influencing 
the layout of the document. Paragraph start tags may get very long. I 
personally wouldn't like it if I always had to put a paragraph's first word 
immediately after the start tag.

Best regards

