OASIS Mailing List Archives

 



office message



Subject: Re: [office] Fwd: ODF spec question (white-space processing)


Daniel,

Daniel Vogelheim wrote:
> Hello,
> 
> Sorry I'm a bit late with this, but I had some trouble with the email
> list.
> 
> 
>>David Faure wrote:
>>
>>>In 5.1.1 (page 84) it specifies that extra white space characters are 
>>>ignored.
>>>I read this to be about 
>>>- more than one consecutive literal whitespace character
>>>- any literal whitespace following a text:s or text:tab element.
>>>
>>>OOo adds a case that I don't agree with:
>>>- any whitespace after an opening text:p tag.
>>>
>>>So  <text:p>         foo</text:p>
>>>will have only one word and zero spaces in Writer.
>>>I expect it to have 1 space and one word.
> 
> 
> Me too.
> 
> Michael Brauer wrote:
> 
>>The correct interpretation is to ignore white space characters at the 
>>beginning of the paragraph, as OOo does. The explanation for this is in 
>>section 5.1.1, first paragraph
>>
>>"If the paragraph element or any of its child elements contains white-space 
>>characters, they are collapsed, in other words they are processed in the same 
>>way that [HTML4] processes them."
>>
>>HTML ignores white space characters behind the start element tag, 
> 
> 
> Actually, HTML does no such thing. Neither the HTML spec nor actual
> HTML browsers remove existing whitespace behind a start element tag.

Which browser are you using? Neither my Mozilla nor my Firefox displays 
that white space.

> 
> Additionally, HTML optionally (!) allows whitespace just after/before
> to be ignored FOR LAYOUT. (Apparently, this is a legacy thing from
> older HTML versions.) If we really wish to be compatible to this

Where did you find any information about the handling of whitespace 
before/after a paragraph's text? I didn't find anything in the HTML4.01 spec, 
so a reference would be very helpful here.

But if one looks only at words, then it is only consistent to ignore 
white-space characters at the beginning and end of paragraphs as well. That 
is what current browser implementations do, and what OpenDocument does, too.

> 
> On the spec itself:
> 
>>"If the paragraph element or any of its child elements contains white-space 
>>characters, they are collapsed, in other words they are processed in the same 
>>way that [HTML4] processes them."

Well, the intention behind the white-space processing rules is to allow 
authors to pretty-print paragraph text. HTML is used as a model here 
because its rules work very well in practice. We could perhaps find better 
wording for how the OpenDocument white-space processing rules relate to 
HTML, but IMHO ignoring white-space characters at the paragraph start is 
consistent with the HTML specification.
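
To make the two readings concrete, here is a rough Python sketch. It is not 
taken from the specification or from any implementation. The function name 
collapse_whitespace and the strip_edges flag are made up for illustration, 
and the sketch assumes that the white-space characters in question are space, 
tab, carriage return and line feed. It only looks at plain character data and 
ignores child elements such as text:tab.

```python
import re

# Assumption: white space means space, tab, carriage return and line feed.
_WS_RUN = re.compile(r"[ \t\r\n]+")


def collapse_whitespace(text: str, strip_edges: bool = True) -> str:
    """Collapse every run of white space into a single space.

    strip_edges=True  models the reading in which white space at the
                      beginning and end of a paragraph is ignored.
    strip_edges=False models the reading in which a leading run is kept
                      as one space.
    """
    collapsed = _WS_RUN.sub(" ", text)
    return collapsed.strip(" ") if strip_edges else collapsed


# Pretty-printed paragraph content, as an author might indent it in the XML.
pretty = "\n    foo\n    bar  baz\n"
print(repr(collapse_whitespace(pretty)))

# The example from this thread: <text:p>         foo</text:p>
print(repr(collapse_whitespace("         foo")))
print(repr(collapse_whitespace("         foo", strip_edges=False)))
```

With strip_edges=True the pretty-printed paragraph comes out as 
'foo bar baz' and the example from this thread as 'foo'. With 
strip_edges=False the latter keeps one leading space, which is the result 
David expected.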

Best regards

Michael

