OASIS Mailing List Archives

office-formula message

Subject: Whitespace fixups

Dennis E. Hamilton (dennis.hamilton at acm.org) pointed out an error
in OpenFormula's description of whitespace handling. See:

In particular, if there is an embedded newline, do we use \n or \r?
The current text explicitly forbids CRLF, but one place says "\n" and the other says "\r". I propose that it be \n (I believe that's what we meant), and that we identify characters exclusively by Unicode code point so that it's unambiguous.

We also need to clarify the text - there's a bunch of "may"s, but really, we want to REQUIRE that applications ignore whitespace when calculating results.  (They can retain them for display, of course, but we want to make sure that everyone else accepts the whitespace too so that the whitespace doesn't have to be removed for storage). Here's a draft, which I intend to put in the document today (it's a tweak of current text):


For calculation purposes, whitespace is generally ignored unless it is inside the contents of string constants or text surrounded by single quotes. Specifically, applications shall ignore any whitespace characters before and/or after any operators, constant numbers, constant strings, constant errors, inline arrays, parentheses used for controlling precedence, and the closing parenthesis of a function call. Whitespace shall be ignored following the initial equal sign(s). Whitespace shall be ignored just before a function name, but whitespace shall not separate a function name from its opening parenthesis. Whitespace shall not be used in the interior of a terminating grammar rule (a rule that references no rules other than character sets, whether internally or externally defined), unless specifically permitted by the terminating grammar rule, since these rules define the lexical properties of a component. As a result, applications shall not write formulas with whitespace embedded in any unquoted identifier, constant number, or constant error. Thus “= 3 . 5 + 3” is not a legal formula, because the syntax for Number does not permit interior whitespace, but “= 3.5 + 3 ” is a legal formula. Note that “= 3 %” is also legal. Applications shall consider the following characters as whitespace characters: space (U+0020), tab (U+0009), newline (U+000A), and carriage return (U+000D).
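
To illustrate the intent of the draft paragraph above, here is a minimal Python sketch of a tokenizer that skips the four whitespace characters between tokens but never inside one. The token patterns and the name tokenize are hypothetical and heavily simplified for illustration; they are not the OpenFormula grammar and cover only numbers, double-quoted strings, and a few single-character operators.

```python
import re

# The four whitespace characters named in the draft text:
# space, tab, newline, carriage return.
WHITESPACE = "\u0020\u0009\u000A\u000D"

# Hypothetical, simplified token patterns (NOT the real OpenFormula grammar).
TOKEN = re.compile(r'''
    (?P<number>\d+(?:\.\d+)?|\.\d+)
  | (?P<string>"(?:[^"]|"")*")
  | (?P<op>[-+*/^&=<>%(),;])
''', re.VERBOSE)

def tokenize(formula):
    """Split a formula into tokens, ignoring whitespace between tokens.

    Whitespace inside a string constant is kept because the string pattern
    consumes it. Whitespace inside a number can never match the number
    pattern, so "3 . 5" ends in a ValueError at the stray ".".
    """
    tokens = []
    pos = 0
    while pos < len(formula):
        if formula[pos] in WHITESPACE:
            pos += 1  # ignorable whitespace between tokens
            continue
        match = TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(
                f"unexpected character at offset {pos}: {formula[pos]!r}")
        tokens.append(match.group())
        pos = match.end()
    return tokens
```

With this sketch, tokenize("= 3.5 + 3 ") and tokenize("=3.5+3") yield the same token list, tokenize("= 3 %") is accepted, and tokenize("= 3 . 5 + 3") raises ValueError.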

An embedded line break shall be represented by a single newline character (U+000A), not by a carriage return-linefeed pair. When embedded in an XML document the newline character is typically represented as “&#x0A;” (or, in decimal, “&#10;”).
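
As a rough sketch of what a writer might do before storing a formula, the Python below collapses line breaks to a single U+000A and then escapes the result for use in XML, emitting the newline as the hexadecimal character reference &#x0A;. The function names are hypothetical. Converting a lone carriage return to a newline is an assumption made for this example; the draft text only rules out the carriage return-linefeed pair.

```python
from xml.sax.saxutils import escape

def normalize_line_breaks(formula):
    """Represent every embedded line break as a single U+000A."""
    # CRLF pairs first, so they do not turn into two newlines.
    formula = formula.replace("\r\n", "\n")
    # Assumption for this sketch: a lone CR is also treated as a line break.
    return formula.replace("\r", "\n")

def to_xml_text(formula):
    """Escape a formula for XML, writing newlines as character references."""
    # escape() handles &, < and > first, then applies the extra mapping,
    # so the inserted "&#x0A;" is not escaped a second time.
    return escape(normalize_line_breaks(formula),
                  {"\n": "&#x0A;", '"': "&quot;"})
```

For example, to_xml_text("=1+\r\n2") returns "=1+&#x0A;2".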

--- David A. Wheeler 
