OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.


Help: OASIS Mailing Lists Help | MarkMail Help

office-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]

Subject: OpenFormula - scope, other info resources?

Here are some thoughts on the scope of "OpenFormula", and
a request for comments about its scope and resources.

First, scope. At the VERY LEAST, "OpenFormula"
should define the format for exchanging
spreadsheet formulas as an extension of the OpenDocument
specification, based on OOo and Excel's current actions.

More is possible. I'd really love to hear some
feedback on the following:
* It'd be nice if this formula language was also used in
   draw:formula & anim:formula.  I don't think it'd be that hard.
   But they're currently in a different format in
   OpenOffice, in particular, they use "," as the function
   separator not ";".  Supporting them requires a few
   minor additions, i.e., $number as a parameter (unfortunately
   starting at 0, which I think is a REALLY bad idea because
   shell and others start at 1), and $ by itself as a variable.
   Is there interest in trying to unify them?  Are small
   changes to draw:formula and anim:formula okay to do this?
* Although I'd like the syntax to be general enough to be
   extensible later, I'm not working on a truly general
   mathematical notation that can handle algebraic unknowns, etc, etc.
   Indeed, people have spreadsheets of important tasks, and we
   need to make SURE that we don't break their formula semantics,
   so I think the semantics of this are VERY constrained by
   the past... else too much breaks.
* Most people SEE formulas on the screen, and type them in...
   should the HUMAN-READABLE format be specified?
   I suspect the answer really should be "yes", or
   at least that they should suggest a format, so that people can
   type in formulas and have a reasonable expectation that they'd work.
* Should formulas, as exchanged, use the same or more similar
   syntax to how they're entered on the screen?
   For example, the current OOo uses square brackets around
   cell references, e.g., "[.A1]" is represented on-screen as "A1".
   Frankly, I think there are lots of advantages to having the
   data format "as close as possible" to the entered format...
   "identical" being the ideal.
   A spreadsheet program has to parse human input anyway, and
   if the format is the same, then you only need to write that
   code once, and debugging it's simpler too.
* For the moment, I think it'd be wiser to declare setting
   display formats based on formula contents as out-of-scope.
   Most spreadsheet programs set the default display format
   based on the operations in the formula and the formats of
   their inputs (e.g., putting in the formula =A1+A2
   to a blank cell will change the format to "currency" if
   A1 and A2 are formatted as currency).  But there doesn't
   seem to be lots of consistency on this point; Excel on the
   Mac, Excel 2002, and Excel 2000 all seem to have different
   heuristics on setting the display format.

I've been hunting everywhere for existing info resources.
There's little available, though there are LOTS of books
on how to point-and-click SUM()s.  Can anyone point
me to other USEFUL materials about the exact semantics of Excel
or OOo? The best materials I've found are:
* "Excel 2000 in a Nutshell", O'Reilly.  The only book with
   REAL content about formula +-semantics.  I've looked at about 20
   books about Excel, and most of the other books are
   a waste of time for this purpose. Not even "Excel Formulas
   and Functions for Dummies" holds a candle to this, even though
   it focuses only on formulas.
* http://sc.openoffice.org/ and in particular
   "OpenOffice.org's Documentation of the Microsoft Excel File Format"
   is an absolute delight for real, meaty technical details.
   It's also scary; it's unbelievable that so many people
   use spreadsheets for exchanging information,
   even though there's no official spec and the Excel spreadsheet
   format (as documented here) is INCREDIBLY fragile.
   The formulas are stored in RPN, and the functions are
   just id'd by number. Getting data & in out of Excel,
   or dealing with corrupted data, is a true nightmare.
   I'll be SO glad to use OpenDocument instead after delving into
   this thing, this format is HORRENDOUS as an exchange format.
* Ximian has a nice LXR setup that lets you easily surf
   the OpenOffice.org spreadsheet source code:
   Of particular interest is OO.o's formula compiler code:
   I haven't read it in depth yet, but it looks like a good
   place to look.

--- David A. Wheeler

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]