Subject: Formula subcommittee status
All -- here is a status report on the OpenDocument formula subcommittee, as I see it.
The OASIS formula subcommittee continues to develop the recalculated formula specification. It is primarily for spreadsheets, but it is devised so that it could be useful for other purposes. Even without this specification, many of today's spreadsheet files can be exchanged between ODF-supporting applications, but we want to do much better. Our goal is to ensure that users can switch from one application to another, and exchange data with users of different applications, all while recalculating correctly.
Developing any good standard requires representatives from multiple implementors, and we are blessed to have a large list! We have reps from OpenOffice.org and Sun StarOffice (Eike Rathke), KDE KOffice (David Faure and Tomas Mecir), Gnumeric (Andreas J. Guelzow and Jody Goldberg), IBM/Lotus 1-2-3 (Rob Weir), and wikiCalc (Dan Bricklin, co-creator of the spreadsheet). We also have many experienced users (such as Tom Metcalf, a scientist specializing in the astrophysics of the Sun). Several mathematicians, both users and developers, are members as well.
Our most recently added member is Dr. Andreas J. Guelzow, the Gnumeric developer leading the implementation of its OpenDocument import and export. Dr. Guelzow has a PhD in Mathematics from the University of Manitoba in Canada, and is a full professor, department chair, and faculty association president at Concordia University College of Alberta. There are many other excellent people participating; my thanks to all.
One of the first actions of the group was to accept the OpenFormula project's specification as the base document, so we haven't had to start writing a specification from scratch. OpenDocument is able to support multiple formula formats, so after discussion we agreed to give this formula specification a particular name.
We have decided to keep the name "OpenFormula" for this specific format, with the conventional prefix "of:" and namespace "urn:oasis:names:tc:opendocument:xmlns:openformula:1.0"; table cell entries end up looking like this:
<table:table-cell table:formula="of:=5+2*3"></table:table-cell>
I'd encourage applications to begin getting ready to read this namespace. I had originally thought we'd change the name to something else, but good names are hard to find, and this one works.
We've developed the syntax, which is based on existing practice. When exchanged in files, cell addresses are surrounded by [...], as OpenOffice.org does; this makes it trivial to distinguish between named expressions and cell addresses, so the notation can support arbitrarily wide tables and arbitrary names. We learned that implementations differed on whether "^" was left-to-right or right-to-left associative; after a little discussion, it was agreed to be left-to-right, and both KSpread and wikiCalc were changed within a week of that decision to use left-to-right association for "^". The defined syntax also avoids some nasty problems, such as a space being both an operator (intersection) and ignorable; the "~" operator is used in files to represent cell intersection.
We also have a definition of the types. It seems odd, but the types in spreadsheets have never been clearly defined. Some applications have a distinguished logical type, and others do not; we've decided to leave that as implementation-defined, and to explain that portable document creators must not depend on whether logical is a separate type. This is easy to do, and it is already true for practically all spreadsheet files anyway. We've also left it implementation-defined what happens when a number is requested but text is found; doing this conversion "automatically" turns out to have many pitfalls (including internationalization issues).
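To make two of the syntax points above concrete (the "of:" prefix on table:formula, and left-to-right association for "^"), here is a minimal Python sketch. It is only an illustration, not text from the specification or code from any implementation: the function names read_formula, power_left, and power_right are made up for this example, and the xmlns:table declaration is added to the sample cell only so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET
from functools import reduce

# Namespace of the ODF "table" prefix; declared here so the sample parses alone.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"

SAMPLE_CELL = (
    '<table:table-cell xmlns:table="' + TABLE_NS + '" '
    'table:formula="of:=5+2*3"></table:table-cell>'
)


def read_formula(cell_xml):
    """Return (prefix, expression) for a table-cell, or None if it has no formula.

    Hypothetical helper: it only splits the attribute value, it does not
    parse or evaluate the expression.
    """
    cell = ET.fromstring(cell_xml)
    raw = cell.get(f"{{{TABLE_NS}}}formula")
    if raw is None:
        return None
    if raw.startswith("="):
        # No prefix present; do not split on a ":" inside the expression.
        return ("", raw)
    prefix, _, expression = raw.partition(":")
    return (prefix, expression)


def power_left(operands):
    """Fold a chain a^b^c left to right: (a^b)^c."""
    return reduce(lambda acc, x: acc ** x, operands)


def power_right(operands):
    """Fold the same chain right to left: a^(b^c), shown for contrast."""
    return reduce(lambda acc, x: x ** acc, reversed(operands))


if __name__ == "__main__":
    print(read_formula(SAMPLE_CELL))   # ('of', '=5+2*3')
    print(power_left([2, 3, 2]))       # 64, i.e. (2^3)^2
    print(power_right([2, 3, 2]))      # 512, i.e. 2^(3^2)
```

The gap between 64 and 512 for the same text "2^3^2" is the kind of disagreement that makes agreeing on one rule necessary before recalculated files can be exchanged safely.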
We have agreed that applications may need to create specialized subsets and supersets, and made that possible. However, users need to know what capabilities they can typically count on for general-purpose use, so we have identified several groups of capabilities to aid those users. The "small" group identifies a set of a little over 100 functions, which can be implemented on PDAs and smaller applications. This group is actually quite robust; it includes many database and financial functions, and at least one Palm PDA spreadsheet implements them. Originally wikiCalc was only going to implement a few functions, but after looking at this list, its implementors decided to implement the entire "small" list (and have already succeeded). All of the functions in the "small" set are defined; we are now completing the definitions of the other functions for the "large" group.
We have switched from a Wiki to using an OpenDocument file as the specification. We had anticipated doing that sooner or later.
One key aspect inherited from the OpenFormula project is that test cases are included in the definitions. This is incredibly valuable; test cases greatly clarify the defining text, and we can apply the test cases to many different applications. In the end, recalculated formulas aren't portable unless they get the "same" answers everywhere, and test cases make this far more likely. Daniel Carrera has developed an XSLT-based program that generates test spreadsheets from the documented test cases, similar to Wheeler's older tool for MediaWiki. Though not complete, it's working well, and the tool should be complete soon.
Recently IBM has contributed the Lotus 1-2-3 v9.8 "help" file documentation to OASIS, to help us accelerate development of the specification. My sincere thanks to IBM and particularly Robert Weir for making this possible.
This is a lengthy summary, yet it doesn't list all that's been done; my apologies in advance for any omissions!
There are still "TODOs" in various places, but we're making progress. In particular, several people are now working to finish defining the remaining functions. It may seem odd, but working out the general framework and defining the most common functions took the most time; there are more functions to define, but we expect progress to be much faster now that the key decisions and the framework are in place. We could always use help from those with expertise, though. If you are interested in helping, please join us!
--- David A. Wheeler