OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.


Help: OASIS Mailing Lists Help | MarkMail Help

office-formula message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]

Subject: Summary of OpenFormula

I promised to write a summary about the OpenFormula specification for 
those who weren't familiar with it, so below is my summary.

Background: We need to decide if we should use OpenFormula as a base 
document, or if we should start from a blank sheet of paper (we can 
CHANGE anything... we just need to know what to start from).  The people 
who I thought might object have said they'd be happy to use OpenFormula 
as a base document.  So tomorrow, I plan to ask if anyone objects or 
needs more time to study OpenFormula.  If ANYONE says they want more 
time (at the teleconference or by end of the day March 2) I believe they 
should have more time -- but if that's so, let's make a decision by 
March 10.  But if everyone's happy with OpenFormula as a base document, 
or doesn't care, then let's not wait for no reason; we have work to do.


Summary of OpenFormula:

OpenFormula is a draft specification for spreadsheet formulas, offered 
to OASIS as a contribution.  It is based on analyzing many different 
spreadsheet applications, in particular Excel, OpenOffice.org Calc, 
Gnumeric, KSpread, Lotus 1-2-3, Quattro Pro, and SheetToGo.  The syntax 
is derived from OpenOffice.org 2's storage format.  The semantics are 
mostly based on what is common between applications, though at higher 
levels it picks specific semantics based on Excel / Gnumeric / SheetToGo.

The "scope" section of OpenFormula describes its purpose: "OpenFormula 
is an open format for exchanging recalculated formulas between office 
application implementations, particularly for spreadsheets. OpenFormula 
defines the types, syntax, and semantics for calculated formulas, 
including many predefined functions and operations, so that formulas can 
be exchanged between applications and produce substantively equal 
outputs when recalculated with equal inputs. Both closed and open source 
software can implement OpenFormula.  OpenFormula is intended to be a 
supporting document to the Open Document Format for Office Applications 
(OpenDocument) format, particularly for defining its attributes 
table:formula and text:formula. It can be used in other circumstances 
where a simple, easy-to-read infix notation is desired for exchanging 
recalculated formulas."

The scope section also says "OpenFormula does not define:
    * the user interface. User interfaces may use different syntaxes, 
different function names, and/or different parameter orders.
    * the display format or controls over it.
    * internal representations.
    * a general notation for mathematics.
    * a full-fledged programming language. OpenFormula formulas 
calculate a single result and return it. By design, most operations and 
functions are free of side-effects, and it is not possible to loop forever."

Here is the outline of the document:
   1.  Scope
   2. Normative References
   3. Document Notation and Conventions
   4. Conformance - including test case conventions and test data
   5. Basic Concepts
   6. Types
   7. Expression Syntax
   8. Standard Operators and Functions.  This is subdivided further:
         1. Common Template for Functions and Operators
         2. Implicit Conversion Operators (Type Conversion)
         3. Standard Operators
         4. List of Functions
         5. Array Functions
         6. Database Functions
         7. Date and Time Functions
         8. Financial Functions
         9. Information Functions
        10. Lookup Functions
        11. Logical Functions
        12. Mathematical Functions
        13. Statistical Functions
        14. Text Functions
   9. Relationship with OpenDocument
  10. Miscellaneous
  11. Guidelines for Users (Non-normative)
  12. User Input-Output (Non-normative)
  13. Development of this Specification (Non-normative)
  14. Rules for Contributors
  15. License Terms
  16. Bibliography (Non-normative)

OpenFormula has "levels". A "level" describes the amount of 
functionality provided by an OpenFormula implementation or required by a 
given file containing formulas. There are four levels, where each level 
is a strict superset of lower levels:
    * Level 1: Minimum implementation. This level provides the minimal 
capabilities (including functions, types, and their meanings) that are 
very widely implemented with spreadsheet applications, even in 
resource-contrained environments. It includes 109 functions.
    * Level 2: Minimum desktop implementation. This level provides the 
minimal capabilities expected from typical desktop spreadsheets. Many 
current desktop implementations meet or nearly meet this level.
    * Level 3: Basic implementation. This level provides all the 
capabilities necessary for typical desktop spreadsheet use. Many current 
implementations meet or nearly meet this level. This adds a 
distinguished logical type and support for complex numbers.
    * Level 4: Full implementation. This level provides some additional, 
less-commonly used features intended for advanced users.

Note that this a little different than the emerging consensus in the 
office-formula group; many seem to be interested in having a 
finer-grained notion of "packages".  But the basic idea of making it 
possible to state simply WHAT an application implements, or a 
spreadsheet file requires, is the same.

The OpenFormula spec goes on to say:
"A conforming application MUST report the highest level it conforms to, 
if it claims conformance to this specification....  all conforming 
applications MUST meet the level 1 requirements to claim conformance to 
this specification. Applications MAY accept extensions as long as they 
do not interfere with processing data in the format defined here. In 
particular, applications MAY support additional formulas and operations, 
additional optional parameters for functions, or make certain function 
parameters optional when they are required by this specification. It is 
RECOMMENDED that applications clearly document their extensions in all 
user documentation, both online and paper, in a manner that users would 
be very likely to be aware when they are using a non-standard extension. 
Applications MAY partly implement a higher level of the specification."

The OpenFormula spec also says:
"This specification's text is written as a description of the 
requirements of an implementing application. However, documents can also 
comply (or fail to comply) with this specification. A conforming 
document MUST be correctly recalculable given only the specification 
given here. To claim conformance, a conforming document MUST report the 
lowest level it conforms to. A document with no formulas is notionally 
given level 0. Note that some applications MAY be able to read and write 
some documents of a higher level than their compliance level, because 
applications may partially implement higher levels."

The OpenFormula spec embedded test cases into the definitions of 
functions and operators, as NORMATIVE (required) components -- 
intentionally so.  We found that VERY valuable; the test cases could be 
immediately used to check on the implications of the requirements, and 
to help implementors make sure that they were implementing the 
requirements (and if not, which ones).  The combination of text and test 
cases appears to be more powerful than either one alone.  It's not clear 
that this approach would work in many other domains, but I think all 
participants found it helpful for spreadsheet formulas (since "getting 
the right answer" is so important in formulas).

Non-normative explanatory text was included but marked specially, so 
that the document could be printed without it.  This text included hints 
on how to properly implement various parts of the spec; such aids should 
increase the likelihood that implementations will be interoperable.

A quick note on terminology: The term "function" was used for anything 
that syntactically was called like this: FUNCTIONNAME(...).  The infix, 
prefix, and postfix operators are all termed "operators", though other 
than their special syntax the operators work like functions.

Here's the key text on Types: "In OpenFormula a value MAY have one of 
the following basic types: Text (the string type), Number, Logical (the 
boolean type), Error, Reference, or Array. At lower levels the Logical 
type MAY be implemented as a subtype of Number, as described below. At 
higher levels complex numbers MUST be supported as described below. An 
implementation MAY provide other types, and it MAY have many specialized 
subtypes of these types. However, a document MUST only use constructs in 
this specification (including only these types) to conform to it."
The last sentence simply means that if you use some nonstandard 
construct (e.g., datatype or function), you shouldn't be surprised if it 
won't work on other implementations.  It might be useful to define a 
term like "well-formed" (it's syntactically fine, but the data type or 
function is not standard), but that wasn't in the OpenFormula spec.

--- David A. Wheeler

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]