OASIS Mailing List Archives

office message


Subject: Re: [office] Fwd: ODF spec question (white-space processing)

Daniel, David, all,

thank you very much for the valuable discussion. I think we have reached an 
agreement on how OpenDocument shall behave regarding white-space at the 
beginning of paragraphs, and I will soon create a proposal for clarifying 
that in the specification.

Some more comments are inline:

Daniel Vogelheim wrote:
> Hi David,
> You wrote:
>>On Monday 11 September 2006 01:04, Daniel Vogelheim wrote:
> I am curious as to whether those same people would expect the
> following to not have any whitespace, too:
>   <text:p>
>     <text:span>
>         My paragraph text
>     </text:span>
>   </text:p>
> Which, according to absolutely everybody :), it seems, will have
> whitespace.

If we want ODF documents to render the same as HTML documents, then I think 
we should clarify that all spaces before the "My" are ignored in this 
example, too.

Actually, the collapsing of white-space characters in ODF is already defined 
to occur even for white-space character sequences that have start and end 
tags inside them. I therefore think it would only be consistent to assume 
this behavior for spaces at the paragraph start as well.
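
To make the intended behavior concrete, here is a minimal sketch in Python. 
It is an illustration only, not text from the specification: the function 
name collapse_paragraph is made up, the <text:s>, <text:tab> and 
<text:line-break> elements are ignored, and trailing white space is 
deliberately left alone because only the paragraph start is discussed here.

```python
import xml.etree.ElementTree as ET

# White-space characters that ODF collapses: space, tab, CR, LF.
WHITESPACE = " \t\r\n"


def collapse_paragraph(xml_source):
    """Collapse white space in one paragraph, looking through start/end tags.

    Hypothetical helper for illustration; leading white space is dropped by
    treating the paragraph start as if a space had already been seen.
    """
    paragraph = ET.fromstring(xml_source)
    # itertext() yields the character data in document order, so a run of
    # white space interrupted by tags becomes one contiguous run here.
    raw = "".join(paragraph.itertext())

    out = []
    previous_was_space = True  # assumption: paragraph start acts like a space
    for ch in raw:
        if ch in WHITESPACE:
            if not previous_was_space:
                out.append(" ")
            previous_was_space = True
        else:
            out.append(ch)
            previous_was_space = False
    return "".join(out)


SAMPLE = """<text:p xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
    <text:span>
        My paragraph text
    </text:span>
  </text:p>"""

print(repr(collapse_paragraph(SAMPLE)))
```

Under these assumptions the white space before "My" disappears although it 
is split across the <text:p> and <text:span> start tags.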

> However: According to the SGML whitespace processing rules, which were
> sort of emulated in the early HTML specs, which were sort of evolved
> into the HTML 4 rule(s) that we have been discussing in this thread,
> this would NOT have been the case. That is, for all I can see, not
> clearly defined in the HTML spec. (That is exactly the *optional* part
> of the whitespace suppression behind start tags.)

Well, my reading of the HTML 4 specification is as follows: The HTML 
specification says that white-space characters are word delimiters, and that 
HTML lays out only words. For that reason it makes no difference whether there 
is a <span> tag between <p> and the word "My", as both cases result in a 
couple of word delimiters, followed by the word "My". Therefore, both examples 
will be displayed the same.
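
As a rough illustration of this reading (my own simplification, not an 
algorithm taken from the HTML 4 specification), one can model layout as 
"drop the tags, then split into words". The helper name html_words and the 
regular expressions are assumptions made for this sketch.

```python
import re

# White-space characters listed by HTML 4: space, tab, form feed,
# zero-width space, carriage return, line feed.
WORD = re.compile(r"[^ \t\f\u200b\r\n]+")


def html_words(markup):
    """Return the words a words-only layout would see (hypothetical model)."""
    # Tags contribute no characters; the white space around them remains
    # and acts purely as a word delimiter.
    text = re.sub(r"<[^>]*>", "", markup)
    return WORD.findall(text)


WITH_SPAN = "<p>\n  <span>\n    My paragraph text\n  </span>\n</p>"
WITHOUT_SPAN = "<p>\n    My paragraph text\n</p>"

print(html_words(WITH_SPAN))
print(html_words(WITHOUT_SPAN))
```

In this model both paragraphs yield the same list of three words, which is 
why the <span> should not change the rendering.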

I agree with Daniel that the sentence starting with "In order to avoid problems 
with SGML line break rules and inconsistencies" adds an ambiguity (Daniel, 
thanks for pointing me to this sentence). My reading of this sentence 
is that it is meant to alert authors to an interoperability issue: 
legacy applications may not interpret a space character as a word 
delimiter if it occurs immediately after a start tag or immediately before 
an end tag, although the HTML 4 specification wants them to do so. This may 
have an effect on spaces that occur between words, because there it of 
course makes a difference whether there is a word delimiter or not. But this 
legacy behavior IMHO cannot have an influence on spaces at the beginning of 
the paragraph, because there we are always at the beginning of a word. And 
for an application that lays out words only, it IMHO does not make a 
difference whether there are additional word delimiters in front of it. This 
means there are ambiguities (or interop issues) with spaces between words, 
but not at the beginning and the end of the paragraph. However, that's my 
personal interpretation only, and like Daniel, I don't want to re-open the 
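
The difference between the two cases can be sketched as follows. This is a 
toy model of my interpretation, not the behavior of any particular 
application: strict_words treats every white-space character as a word 
delimiter, while legacy_words first drops white space directly after a 
start tag and directly before an end tag. Both function names are invented 
for the example.

```python
import re

WS = r"[ \t\f\r\n]+"


def strict_words(markup):
    """Every white-space character is a word delimiter (hypothetical model)."""
    return re.sub(r"<[^>]*>", "", markup).split()


def legacy_words(markup):
    """Assumed legacy rule: ignore white space right after a start tag and
    right before an end tag, then split into words."""
    text = re.sub(r"(<[A-Za-z][^>]*>)" + WS, r"\1", markup)
    text = re.sub(WS + r"(</[A-Za-z][^>]*>)", r"\1", text)
    return strict_words(text)


BETWEEN_WORDS = "<p>one<b> two</b></p>"
PARAGRAPH_START = "<p> <span> My text</span></p>"

# Between words, the two models disagree about the delimiter.
print(strict_words(BETWEEN_WORDS), legacy_words(BETWEEN_WORDS))
# At the paragraph start, they agree.
print(strict_words(PARAGRAPH_START), legacy_words(PARAGRAPH_START))
```

In this model the space after <b> decides whether "one" and "two" are two 
words or one, whereas the spaces before "My" never change the word list.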

But the room for interpretation that the HTML specification allows convinces me 
that, in the future, ODF should specify the white-space behavior without a 
normative reference to the HTML spec.

Best regards

