OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 



office message



Subject: Formula subcommittee status


All -- here is a status report about the OpenDocument
formula subcommittee, as I see it.

The OASIS formula subcommittee continues to develop
the recalculated formula specification. It is primarily for spreadsheets,
but it's devised so that it could be useful for other purposes.
Even without this specification, many of today's spreadsheet files
can be exchanged between ODF-supporting applications,
but we want to do much better. Our goal is to ensure that users
can switch from one application to another, and exchange data
with users of different applications, all while recalculating correctly.

Developing any good standard requires representatives from
multiple implementors, and we are blessed to have a large list!
We have reps from OpenOffice.org and Sun StarOffice (Eike Rathke),
KDE KOffice (David Faure and Tomas Mecir), Gnumeric
(Andreas J. Guelzow and Jody Goldberg),
IBM/Lotus 1-2-3 (Rob Weir), and wikiCalc (Dan Bricklin,
co-creator of the spreadsheet). We also have many experienced
users (such as Tom Metcalf, a scientist specializing in the astrophysics of the Sun).
Several mathematicians, both users and developers, are also taking part.
Our most recently added member is Dr. Andreas J. Guelzow,
the Gnumeric developer leading the implementation
of its OpenDocument import and export. Dr. Guelzow has a PhD in
Mathematics from the University of Manitoba in Canada, and is
a full professor, department chair, and faculty association president
at Concordia University College of Alberta.  There are many other
excellent people who are participating; my thanks to all.

One of the first actions of the group was that we accepted
the OpenFormula project's specification as the base document, so we
haven't had to start writing a specification from scratch.
OpenDocument is able to support multiple formula formats, so
after discussion we agreed on giving this formula specification
a particular name. We have decided to keep the name "OpenFormula"
for this specific format, with the conventional prefix "of:" and namespace
"urn:oasis:names:tc:opendocument:xmlns:openformula:1.0";
table cell entries end up looking like this:
  <table:table-cell table:formula="of:=5+2*3"></table:table-cell>
I'd encourage applications to begin getting ready to read this namespace.
I had originally thought we'd change the name to something else,
but good names are hard to find, and this one works.
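
As a rough illustration of what "getting ready" might involve, here is a
minimal Python sketch that reads the table:formula attribute from a cell like
the one above and splits off the "of:" prefix. The function name read_formula
is hypothetical, and matching on the literal text "of:" is a simplifying
assumption; a real reader would resolve the prefix through the namespace
declarations in the document.

```python
import xml.etree.ElementTree as ET

# ODF table namespace, needed to look up the table:formula attribute.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
# Conventional prefix for OpenFormula; matched literally here (an assumption).
OF_PREFIX = "of:"


def read_formula(cell_xml):
    """Return (format, expression) for a table:table-cell, or None if no formula."""
    cell = ET.fromstring(cell_xml)
    raw = cell.get("{%s}formula" % TABLE_NS)
    if raw is None:
        return None
    if raw.startswith(OF_PREFIX):
        return ("of", raw[len(OF_PREFIX):])
    # Some other formula format, or no prefix at all.
    return (None, raw)


cell_xml = (
    '<table:table-cell xmlns:table="%s" table:formula="of:=5+2*3">'
    "</table:table-cell>" % TABLE_NS
)
print(read_formula(cell_xml))
```

An application that sees a format other than "of" can then decide whether to
hand the expression to a different parser or to keep only the cached value.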

We've developed the syntax, which is based on existing practice.
When exchanged in files, cell addresses are surrounded by [...],
as OpenOffice.org does; this makes it trivial to distinguish between
named expressions and cell addresses, so the notation can
support arbitrarily wide tables and arbitrary names. We learned that
implementations differed on whether "^" was left-to-right or
right-to-left associative; after a little discussion, it was agreed to be
left-to-right, and both KSpread and wikiCalc were changed within
the week of that decision to use left-to-right association for "^".
The defined syntax also avoids some nasty problems like space
being both an operator (intersection) and ignorable; the "~"
operator is used in files to represent cell intersection.
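
Two of these points are easy to show in a few lines of Python. This is only
a sketch: the helper names are hypothetical, the ".A1" form inside the
brackets is assumed from OpenOffice.org's existing practice, and TaxRate is
a made-up named expression.

```python
import re
from functools import reduce

# Anything between [ and ] is treated as a cell address; bare words are names.
REF = re.compile(r"\[([^\[\]]+)\]")


def cell_references(formula):
    """Return the bracketed cell addresses found in a formula string."""
    return REF.findall(formula)


def power_left_to_right(operands):
    """Evaluate a ^ b ^ c ... with left-to-right association: ((a ^ b) ^ c)."""
    return reduce(lambda acc, x: acc ** x, operands)


# TaxRate is not bracketed, so it is not mistaken for a cell address.
print(cell_references("=[.A1]+TaxRate*[.B2]"))

# Left-to-right, as agreed: (2^3)^2
print(power_left_to_right([2, 3, 2]))

# Python's own ** associates right-to-left: 2^(3^2). This is the kind of
# difference between implementations that had to be settled.
print(2 ** 3 ** 2)
```

The two readings of 2^3^2 give 64 and 512, so a file using a chained "^"
would recalculate differently unless everyone agrees on one rule.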

We also have a definition of the types. It seems odd, but
the types in spreadsheets have never been clearly defined.
Some applications have a distinguished logical type, and others
do not; we've decided to leave that as implementation-defined,
and explain that portable document creators must not depend on
whether logical is a separate type. This is easy to do, and
this is true for practically all spreadsheet files anyway.
We've left implementation-defined what happens when a number is
requested but a text value is found; converting text to a number
"automatically" turns out to have many pitfalls (including internationalization
issues).
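
To make the internationalization pitfall concrete, here is a small Python
sketch. The helper text_to_number and its decimal_sep parameter are
hypothetical; they stand in for whatever locale setting an application
consults, and are not taken from the specification.

```python
def text_to_number(text, decimal_sep):
    """Convert text to a float, assuming the given decimal separator.

    The other of "," and "." is treated as a digit-group separator and dropped.
    """
    group_sep = "," if decimal_sep == "." else "."
    cleaned = text.strip().replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


# The same cell text under two locale conventions:
print(text_to_number("1,234", "."))  # "," read as a group separator
print(text_to_number("1,234", ","))  # "," read as the decimal separator
```

The same text yields 1234.0 under one convention and 1.234 under the other,
which is why a portable document should not rely on implicit conversion.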

We have agreed that applications may need to create specialized
subsets and supersets, and made that possible. However,
users need to know what capabilities they can typically count on for
general-purpose use, so we have identified several groups of
capabilities to aid those users. The "small" group identifies a
set of a little over 100 functions, which can be implemented
on PDAs and smaller applications. This group is actually quite robust;
it includes many database and financial functions, and at least
one Palm PDA spreadsheet implements them. Originally wikiCalc was only
going to implement a few functions, but after looking at this list,
its implementors decided to implement the entire "small" list
(and have already succeeded).
All of the functions in the "small" set are defined; we are now
completing the other functions' definition for the "large" group.

We have switched from a Wiki to using an OpenDocument file
as the specification.  We had anticipated doing that sooner or later.

One key aspect inherited from the OpenFormula project is that
test cases are included in the definitions. This is incredibly
valuable; test cases greatly clarify the defining text, and we
can apply the test cases to many different applications.
In the end, recalculated formulas aren't portable unless they
get the "same" answers everywhere, and test cases make
this far more likely. Daniel Carrera has developed an
XSLT-based program that generates test spreadsheets from
the documented test cases, similar to Wheeler's older tool
for MediaWiki.  Though not complete, it's working well, and
the tool should be complete soon.
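
The idea of applying one set of test cases to many applications can be
sketched in Python as follows. This is not the XSLT tool mentioned above;
the test-case layout, the run_tests helper, and the two stand-in "applications"
are all hypothetical, made up only to show the principle.

```python
import math

# Each test case pairs a formula with the answer every application should give.
TEST_CASES = [
    ("=5+2*3", 11),
    ("=2^3^2", 64),  # left-to-right association for "^"
]


def run_tests(evaluate, cases, tolerance=1e-9):
    """Run each formula through evaluate() and return the list of mismatches."""
    failures = []
    for formula, expected in cases:
        actual = evaluate(formula)
        if not math.isclose(actual, expected, rel_tol=tolerance, abs_tol=tolerance):
            failures.append((formula, expected, actual))
    return failures


# Stand-ins for two applications' recalculation engines (canned answers only).
def app_a(formula):
    return {"=5+2*3": 11, "=2^3^2": 64}[formula]


def app_b(formula):
    # Associates "^" right-to-left, so it disagrees on the second case.
    return {"=5+2*3": 11, "=2^3^2": 512}[formula]


print(run_tests(app_a, TEST_CASES))
print(run_tests(app_b, TEST_CASES))
```

The first run reports no failures, while the second flags the "^" case,
which is exactly the kind of disagreement the embedded test cases expose.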

Recently IBM has contributed the Lotus 1-2-3 v9.8 "help" file
documentation to OASIS, to help us accelerate development
of the specification. My sincere thanks to IBM and particularly
Robert Weir for making this possible.

This is a lengthy summary, yet it doesn't list all that's been done;
my apologies in advance for any omissions!

There are still "TODOs" in various places, but we're making progress.
Several people are now working to finish defining the remaining
functions.  It may seem odd, but working out the general framework
and defining the most common functions took the most time; there are
more functions to define, but we expect progress to be much faster
now that key decisions and the framework are in place.
But we could always use help from those with expertise.
If you are interested in helping, please join us!

--- David A. Wheeler

