[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Summary of OpenFormula
I promised to write a summary about the OpenFormula specification for those who weren't familiar with it, so below is my summary. Background: We need to decide if we should use OpenFormula as a base document, or if we should start from a blank sheet of paper (we can CHANGE anything... we just need to know what to start from). The people who I thought might object have said they'd be happy to use OpenFormula as a base document. So tomorrow, I plan to ask if anyone objects or needs more time to study OpenFormula. If ANYONE says they want more time (at the teleconference or by end of the day March 2) I believe they should have more time -- but if that's so, let's make a decision by March 10. But if everyone's happy with OpenFormula as a base document, or doesn't care, then let's not wait for no reason; we have work to do. =============================== Summary of OpenFormula: OpenFormula is a draft specification for spreadsheet formulas, offered to OASIS as a contribution. It is based on analyzing many different spreadsheet applications, in particular Excel, OpenOffice.org Calc, Gnumeric, KSpread, Lotus 1-2-3, Quattro Pro, and SheetToGo. The syntax is derived from OpenOffice.org 2's storage format. The semantics are mostly based on what is common between applications, though at higher levels it picks specific semantics based on Excel / Gnumeric / SheetToGo. The "scope" section of OpenFormula describes its purpose: "OpenFormula is an open format for exchanging recalculated formulas between office application implementations, particularly for spreadsheets. OpenFormula defines the types, syntax, and semantics for calculated formulas, including many predefined functions and operations, so that formulas can be exchanged between applications and produce substantively equal outputs when recalculated with equal inputs. Both closed and open source software can implement OpenFormula. OpenFormula is intended to be a supporting document to the Open Document Format for Office Applications (OpenDocument) format, particularly for defining its attributes table:formula and text:formula. It can be used in other circumstances where a simple, easy-to-read infix notation is desired for exchanging recalculated formulas." The scope section also says "OpenFormula does not define: * the user interface. User interfaces may use different syntaxes, different function names, and/or different parameter orders. * the display format or controls over it. * internal representations. * a general notation for mathematics. * a full-fledged programming language. OpenFormula formulas calculate a single result and return it. By design, most operations and functions are free of side-effects, and it is not possible to loop forever." Here is the outline of the document: 1. Scope 2. Normative References 3. Document Notation and Conventions 4. Conformance - including test case conventions and test data 5. Basic Concepts 6. Types 7. Expression Syntax 8. Standard Operators and Functions. This is subdivided further: 1. Common Template for Functions and Operators 2. Implicit Conversion Operators (Type Conversion) 3. Standard Operators 4. List of Functions 5. Array Functions 6. Database Functions 7. Date and Time Functions 8. Financial Functions 9. Information Functions 10. Lookup Functions 11. Logical Functions 12. Mathematical Functions 13. Statistical Functions 14. Text Functions 9. Relationship with OpenDocument 10. Miscellaneous 11. Guidelines for Users (Non-normative) 12. User Input-Output (Non-normative) 13. Development of this Specification (Non-normative) 14. Rules for Contributors 15. License Terms 16. Bibliography (Non-normative) OpenFormula has "levels". A "level" describes the amount of functionality provided by an OpenFormula implementation or required by a given file containing formulas. There are four levels, where each level is a strict superset of lower levels: * Level 1: Minimum implementation. This level provides the minimal capabilities (including functions, types, and their meanings) that are very widely implemented with spreadsheet applications, even in resource-contrained environments. It includes 109 functions. * Level 2: Minimum desktop implementation. This level provides the minimal capabilities expected from typical desktop spreadsheets. Many current desktop implementations meet or nearly meet this level. * Level 3: Basic implementation. This level provides all the capabilities necessary for typical desktop spreadsheet use. Many current implementations meet or nearly meet this level. This adds a distinguished logical type and support for complex numbers. * Level 4: Full implementation. This level provides some additional, less-commonly used features intended for advanced users. Note that this a little different than the emerging consensus in the office-formula group; many seem to be interested in having a finer-grained notion of "packages". But the basic idea of making it possible to state simply WHAT an application implements, or a spreadsheet file requires, is the same. The OpenFormula spec goes on to say: "A conforming application MUST report the highest level it conforms to, if it claims conformance to this specification.... all conforming applications MUST meet the level 1 requirements to claim conformance to this specification. Applications MAY accept extensions as long as they do not interfere with processing data in the format defined here. In particular, applications MAY support additional formulas and operations, additional optional parameters for functions, or make certain function parameters optional when they are required by this specification. It is RECOMMENDED that applications clearly document their extensions in all user documentation, both online and paper, in a manner that users would be very likely to be aware when they are using a non-standard extension. Applications MAY partly implement a higher level of the specification." The OpenFormula spec also says: "This specification's text is written as a description of the requirements of an implementing application. However, documents can also comply (or fail to comply) with this specification. A conforming document MUST be correctly recalculable given only the specification given here. To claim conformance, a conforming document MUST report the lowest level it conforms to. A document with no formulas is notionally given level 0. Note that some applications MAY be able to read and write some documents of a higher level than their compliance level, because applications may partially implement higher levels." The OpenFormula spec embedded test cases into the definitions of functions and operators, as NORMATIVE (required) components -- intentionally so. We found that VERY valuable; the test cases could be immediately used to check on the implications of the requirements, and to help implementors make sure that they were implementing the requirements (and if not, which ones). The combination of text and test cases appears to be more powerful than either one alone. It's not clear that this approach would work in many other domains, but I think all participants found it helpful for spreadsheet formulas (since "getting the right answer" is so important in formulas). Non-normative explanatory text was included but marked specially, so that the document could be printed without it. This text included hints on how to properly implement various parts of the spec; such aids should increase the likelihood that implementations will be interoperable. A quick note on terminology: The term "function" was used for anything that syntactically was called like this: FUNCTIONNAME(...). The infix, prefix, and postfix operators are all termed "operators", though other than their special syntax the operators work like functions. Here's the key text on Types: "In OpenFormula a value MAY have one of the following basic types: Text (the string type), Number, Logical (the boolean type), Error, Reference, or Array. At lower levels the Logical type MAY be implemented as a subtype of Number, as described below. At higher levels complex numbers MUST be supported as described below. An implementation MAY provide other types, and it MAY have many specialized subtypes of these types. However, a document MUST only use constructs in this specification (including only these types) to conform to it." The last sentence simply means that if you use some nonstandard construct (e.g., datatype or function), you shouldn't be surprised if it won't work on other implementations. It might be useful to define a term like "well-formed" (it's syntactically fine, but the data type or function is not standard), but that wasn't in the OpenFormula spec. --- David A. Wheeler
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]