OASIS Mailing List Archives: office message

Subject: Re: [office] Fwd: ODF spec question (white-space processing)

Hello all,

Michael Brauer wrote:
>>>The correct interpretation is to ignore white space characters at the 
>>>beginning of the paragraph, as OOo does. The explanation for this is in 
>>>section 5.1.1, first paragraph
>>>"If the paragraph element or any of its child elements contains white-space 
>>>characters, they are collapsed, in other words they are processed in the same 
>>>way that [HTML4] processes them."
>>>HTML ignores white space characters behind the start element tag, 

>> Actually, HTML does no such thing. Neither the HTML spec nor actual
>> HTML browsers remove existing whitespace behind a start element tag.
>What browser are you using? At least my Mozilla as well as Firefox doesn't 
>display them.

Firefox. And yes indeed, it doesn't DISPLAY them. It doesn't REMOVE
them either. HTML has different LAYOUT than does OpenDocument.

>> Additionally, HTML optionally (!) allows whitespace just after/before
>> to be ignored FOR LAYOUT. (Apparently, this is a legacy thing from
>> older HTML versions.) If we really wish to be compatible to this
>Where did you find any information about the handling of whitespace 
>before/after a paragraph's text? I didn't find anything in the HTML4.01 spec, 
>so a reference would be very helpful here.

Thank you for asking. You can find it in the HTML 4.01 spec in chapter
9.1, "White space". The final paragraph starting with "In order to
avoid problems" notes that many implementations do not render
whitespace following a start tag. It tells authors not to rely on this
behaviour, thus indicating that suppression of display of whitespace
just after a start element is undesired, but a) common and b) allowed.
Of course, it also allows the display of such whitespace.

Whitespace just after a paragraph's start tag is a special case of this.
THIS is the phenomenon you are observing. If for some reason you wish
that OpenDocument be compatible with it you should introduce an
appropriate layout flag.
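
As a rough illustration of what such a layout flag could mean, here is a
minimal Python sketch. The function name render_paragraph and the parameter
suppress_edge_whitespace are hypothetical; nothing like them is defined in
the OpenDocument or HTML specs. The collapsing rule is a simplification that
treats space, tab, CR and LF as white space.

```python
import re


def render_paragraph(text, suppress_edge_whitespace):
    """Return the string a layout engine might display for a paragraph.

    The stored text is never modified; only the displayed form differs.
    `suppress_edge_whitespace` is a hypothetical layout flag.
    """
    # Collapse every run of white-space characters into a single space.
    collapsed = re.sub(r"[ \t\r\n]+", " ", text)
    if suppress_edge_whitespace:
        # Layout choice: do not display white space at the paragraph edges.
        return collapsed.strip(" ")
    # Layout choice: display the (collapsed) edge white space as well.
    return collapsed


stored = "  Hello   world "
with_flag = render_paragraph(stored, suppress_edge_whitespace=True)
without_flag = render_paragraph(stored, suppress_edge_whitespace=False)
```

In both cases the variable stored keeps its leading and trailing white
space; the flag only changes what is displayed.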

>But, if one only looks at words, then it is also only consistent to ignore 
>white space characters at the begin and end of paragraphs. That's what 
>current browser implementations do. And what OpenDocument does, too.

I think you meant to say 'OpenOffice.org' instead of 'OpenDocument'.

Anyway, as said above: HTML layout rules indeed allow suppression of
whitespace at the beginning of paragraphs. Many browsers do this. But
they NEVER remove it from content. The whitespace is all there when
you look at the source, when you examine the DOM, when JavaScript
accesses the document.
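
The point that the white space stays in the content can be checked with a
small Python sketch built on the standard library's html.parser module. The
names TextCollector, parsed_text and displayed_text are hypothetical helpers,
and displayed_text is only a simplified stand-in for what a browser's layout
does, not a browser's actual algorithm.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects character data exactly as the parser delivers it."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # No trimming here: this is the document content.
        self.chunks.append(data)


def parsed_text(markup):
    """Return the character data found in `markup`, unmodified."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


def displayed_text(text):
    """Simplified layout step (an assumption): collapse runs, drop the edges."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


content = parsed_text("<p>   Hello   world </p>")
shown = displayed_text(content)
```

Here content still carries the three leading spaces and the trailing space,
while shown is what a layout that suppresses them would display.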

For an output format like HTML, it doesn't make much difference
whether something is in the content or the layout. For OpenDocument it

>> On the spec itself:
>>>"If the paragraph element or any of its child elements contains white-space 
>>>characters, they are collapsed, in other words they are processed in the same 
>>>way that [HTML4] processes them."
>Well, the intention behind the white-space processing rules is to allow 
>authors to pretty-print paragraph text. HTML is used as an archetype here, 
>because its rules do work very well in practice. 

Interestingly, HTML's approach is to keep that pretty-printing fully
inside the document content and let the layout deal with it. I do not
think that is appropriate for OpenDocument, nor is that the intent of
the spec.

