Subject: Meeting Minutes: Office Formula Kickoff Teleconference of March 2, 2006

Meeting Minutes: Office Formula Kickoff Teleconference

On March 2, 2006, 1700-1800UTC (1200-1300EST),
the OASIS Office Formula Subcommittee (SC)
had a "kickoff" teleconference.  Below are the minutes of that meeting.
This is mostly a reprint of the earlier draft; there were no
corrections posted.  It's also being posted to the ODF TC
mailing list, so that they can follow the progress of the
formula subcommittee.

In summary:
* The group agreed to work almost entirely electronically,
  and NOT use a teleconference unless one is specifically called.
* The group agreed to use OpenFormula as a base document.
* A draft syntax is due May 2 to the TC, with a final draft due to the
  TC by September 2. There's some concern that this is ambitious.
* The SC will identify some sort of "packaging" so that applications
  don't need to implement "everything", yet users can easily
  determine what is portable.  Details TBD.
* The spec is ONLY an interchange format; there will be NO mandates
  on the user interface.
* Most agreed on including test cases, as normative; they're valuable
  as clear definitions and for running on implementations.
  Where possible they should be locale-independent (unless they're
  SUPPOSED to test a locale); use functions like DATE() to accomplish this.
* There was general agreement that a Wiki would be very useful.
  As of this time the OASIS Office TC Wiki is still not up,
  so we continue to work via email.
* There was general agreement that implementations can add their own
  types, beyond whatever is specified.

==================== Details =========================================

Members attending (along with their organization and physical location):
* Bastian, Mr. Waldo. Intel Corporation. Portland, OR, USA.
* Bricklin, Mr. Daniel.  ODF. (Near) Boston, MA, USA.
* Edwards, Gary. ODF. Redwood City, CA, USA.
* Faure, Mr. David. KDE e.V. France.
* Goldberg, Mr. Jody.  Novell. (OpenOffice.org AND Gnumeric). Toronto, 
* Kernick, Mr. Richard. ODF. London, UK.
* Metcalf, Dr. Thomas. ODF. Boulder, CO, USA.
* Rathke, Mr. Eike. Sun Microsystems.  Hamburg, Germany.
* Weir, Mr. Robert. IBM. Westford, MA, USA.
* Wheeler, Mr. David. ODF.  (Near) Washington, DC, USA.

Not present:
* Carrera, Mr. Daniel.  ODF.  UK.
* Mecir, Mr. Tomas. ODF. Slovakia.

Spreadsheet implementations represented (by person or by person's 
organization) included at LEAST the following (alphabetically): Gnumeric 
(GNOME), IBM Workplace, KSpread (KDE KOffice), Lotus 1-2-3 (SmartSuite), 
OpenOffice.org Calc, StarOffice Calc, wikiCalc (and technically VisiCalc 
through Dan Bricklin, though that is no longer being maintained).

This is a subcommittee, not an OASIS Technical Committee (TC).
However, if we were a TC, we would have easily had a quorum.
David A. Wheeler (chair) moderated the teleconference.

The group first discussed if anyone wanted to work by teleconference,
or if everyone was happy working "electronically" (email mailing list,
web site, and maybe a Wiki).  All agreed to work primarily
electronically.  Waldo Bastian noted that there MIGHT be a special
reason for us to meet (rarely) by teleconference; all agreed that
anyone could propose through electronic means to have a teleconference,
and that we would have a teleconference if there was general agreement
to do so.  But unless there is a specific request to do so, we
will work electronically (e.g., mailing list / web site / Wiki).
(Wheeler will tell his management that there's no need to budget
for a weekly teleconference; he guesses we'll have at MOST 2 more.
We had great difficulty finding a common time for the kickoff,
given busy schedules and a wide range of timezones.)

The group then worked through the proposed agenda
(the email labelled "Key Issues" on the mailing list).
Below is the original "Key Issues" text (after the "*"),
followed by some of the key discussion points in the teleconference.

* "Should we use OpenFormula as a base specification? Wheeler proposes 
yes; it's easier to start with a document and make changes than to start 
with a blank page.  We don't need to decide this at the kickoff, but we 
need to decide soon, say by March 10.  Between now and then, please read 
the OpenFormula draft. The question isn't whether or not you believe 
every word (it WILL change), but whether or not starting with it is 
better than starting with a blank page.  Wheeler can give a brief 
overview of OpenFormula via the mailing list.  Are there any major 
questions about OpenFormula?"

Earlier Wheeler had posted to the mailing list a summary about 
OpenFormula, and the specification itself has been on the OASIS website 
for some time.

There were no objections in the teleconference, and
many people on the mailing list have already expressly agreed to this.
There were no objections on the mailing list
by the end of March 2, 2006.  Thus, the group will use the
"OpenFormula" specification as a base document by unanimous consent.

* "Schedule: We need to define syntax, then semantics.  Proposal: Draft 
syntax defined by start+2 months; final syntax/semantics (inc. function 
definitions) defined by start+6 months.  Should "start" be March 2?"

Dan Bricklin: We'll start, and see if this is a reasonable schedule.

Wheeler: I intended this to measure the time that the SC would have
to create a submission to the TC.  The TC would then have its own

Dan B: There are many open-ended tasks.  For example, you can always
add more test cases; how do you know when to end?
Also, there probably needs to be a way
to fill in "more complete" material later, e.g., need an easy way
to extend test suite.

Wheeler: We should define minimum criteria for test cases.
The OpenFormula rule was "at least one test case for every function and
every operator", since this at least determined if a particular
function or operator was present.

Dan B: We should have more coverage than that.

* "Issue: Goals/range.  Rob Weir and David A. Wheeler both want it to 
support both limited resources (e.g., tiny PDAs) and massive 
capabilities.  Yet if everything is optional, no one will know what is 
supported, and users won't have real interoperability.  A solution for 
this problem is to define "levels" (OpenFormula did this).  Do we agree 
that we should have multiple levels?
** If we have levels, must define the levels.  Discuss briefly what we
want, what are the levels.  Can we try to create rough resolution by 
March 17 on the levels, if we agree to define levels? (OpenFormula had 
four levels; we don't need to have 4, or use its breakdown, unless we 
like it.)"

Wheeler: There seems to be an emerging consensus to define "packages", 
including some predefined higher-level sets.  We won't require 
supporting any particular package (allowing implementations to be 
specialized), but having predefined packages will make it easy to 
express when an application DOES implement a common set.

Rob: This is not a critical thing to decide yet.  We could specify 
things at the level of OpenFormula.  Use separate section numbers or 
items numbers, if an implementation might implement one and not another 
-- make specification modular.

Jody: This is a good idea, but it takes us down a dangerous path.  There 
is a risk of having a set for every implementation, e.g., the "Excel 
set", the "Gnumeric set", etc.  Must NOT have a profile for each 

Wheeler: Agree. Think this can be handled.

Rob: The combination of data types + functions different for each.
How does implementation determine what capabilities are needed?
With eval, etc., the set of functions supported may even be dynamic!
Should be a way to declare "what I need", and allow run-time detection.
CAN just enumerate used functions -- the real problem is what will
an implementation do with an unknown function?
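
An illustrative sketch of the "enumerate used functions" idea (not
presented at the meeting): the Python below collects the function names
a formula calls and reports any that an implementation lacks.  The
SUPPORTED set, the helper names, the FOO function, and the sample
formula are all hypothetical, and the pattern is deliberately naive (it
does not skip string literals).

```python
import re

# Hypothetical set of functions one implementation supports (an assumption).
SUPPORTED = {"SUM", "ABS", "DATE", "IF"}

# Treat any identifier immediately followed by "(" as a function call.
# Simplification: text inside string literals is not excluded.
FUNC_CALL = re.compile(r"([A-Za-z_][A-Za-z0-9_.]*)\s*\(")

def used_functions(formula):
    """Return the set of function names called in a formula string."""
    return {name.upper() for name in FUNC_CALL.findall(formula)}

def unknown_functions(formula, supported=SUPPORTED):
    """Return the functions the formula needs that are not supported."""
    return used_functions(formula) - supported

# FOO is a made-up function that the hypothetical implementation lacks.
sample = "=IF(SUM([.A1:.A3])>0;ABS([.B1]);FOO([.C1]))"
print(sorted(unknown_functions(sample)))
```

This only detects the gap; what to do about it (error value, placeholder
kept for export, and so on) is the open question raised above.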

Wheeler: Treat failure as implementation issue?

Dan Bricklin: Make it easy for implementations to grow capabilities.

?: It's a lot like a link error.  How handle if external file not found?

Dan: Any function may have an error.  Also, what if it returns
a type not handled?

Jody: Agree to a point, but it must be able to export back out.  On 
import, must be able to include a placeholder for export... this is an 
implementation issue, it wouldn't be represented in the canonical 
format.  What about different versions of same function?  Need a naming 
convention (naming schema) for versioning a function.

Jody: I talked with Microsoft; their users requested that the stored 
format look SIMILAR to the displayed format.  That's why they switched 
to A1 format, EVEN THOUGH R1C1 format compresses better and supports 
wide columns more easily.  So rather than formulas having naming 
convention, near front, have a mapping: 'SUM' means 'SUM version 2', etc.

Dan: Something like that, if there's no table it just does it in-place.  
So we could support it.

Wheeler: I talked with Microsoft earlier, encouraging them to join.  One 
possibility is having "syntactic skins" (e.g., a syntax for ODF and a 
variant syntax for MOOX).  Generally the storage format can be similar 
to the display, though having a special marker for cell references 
([...] in OpenFormula) is very helpful to disambiguate between cell 
references and named expressions.

Jody: I want to stress round-trip persistence.

Wheeler: Agree.

* "Scope: Define this specification as ONLY an interchange format, and 
at most RECOMMEND user interface issues?  Wheeler recommends defining 
the spec as ONLY an interchange format.  Spreadsheets vary widely in 
user interfaces: parameter separator (comma vs. semicolon), function 
names (displayed name often varies by locale), number syntax (what 
locale?), equal-to operator (= or ==), intersection operator (" " or 
"!"), and so on.   The key is data interchange, not presentation, so 
Wheeler thinks we should work on defining how it's EXCHANGED as the scope."

Dan: Definitely should NOT require a UI.

Tom: I agree.  Do we even want to recommend them?  Probably not.

Richard K: Agree.

Jody: Format should be for storing, NO specific ties to what's displayed.

Rob: We'll have a processing model, but NOT how it's edited.

In short, there was universal agreement that the specification should 
ONLY cover the storage format and processing model, NOT the user interface.

* "Test cases: Should we include test cases in the spec? Wheeler 
STRONGLY recommends it.  Including test cases eliminates many problems 
of ambiguity in the text; Wheeler believes it is VERY difficult to write 
unambiguous text, but that well-written text accompanied by some clear 
test cases can together create an unambiguous specification that is 
easier to create and to read.  In addition, including test cases makes 
it much easier to test and assure compliance in implementations.  
OpenFormula did this successfully, and the KSpread developers seemed to 
find the test cases useful."

Tom Metcalf: Strongly agree; normative test cases are very important.  
For formulas, makes sense for normative.  For other cases wouldn't make 

Jody: Thoroughly agree, test cases raise issue.  Problem - if there's a 
subset, you may not be able to run the test cases.  What if SUM on date, 
but some applications don't support dates - how handle that?  What if a 
test needs a function just to do the test?

Wheeler: Without ABS and <=, it's hard to test many of the numeric results.
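
A minimal sketch of why those two are needed, written in Python rather
than in any spreadsheet syntax: a numeric test usually takes the form
ABS(actual - expected) <= epsilon, because exact equality fails on
ordinary floating-point results.  The helper name and the default
epsilon are assumptions chosen for illustration.

```python
def approx_equal(actual, expected, epsilon=1e-6):
    """Mirror a test of the form ABS(actual - expected) <= epsilon."""
    return abs(actual - expected) <= epsilon

# Exact comparison fails on a common binary floating-point case...
exact = (0.1 + 0.2 == 0.3)

# ...while the ABS/<= form accepts the result within the tolerance.
tolerant = approx_equal(0.1 + 0.2, 0.3)

print(exact, tolerant)
```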

?: It's reasonable to assume that most implementations have those.

Waldo: Some test cases will require multiple (packages).

Dan B: Test cases have two purposes - (1) run, and (2) express.  Even if 
you can't run it, it does make things clear.

Rob: Concerned with it being normative.  
Need to describe "what does it mean".  Do NOT want to depend on locale.

Dan: This gets to types vs. displayed values.  (The specification should 
worry about values, NOT how they're displayed.)

Wheeler: An example is DATE() in OpenFormula.   The first drafts of 
OpenFormula used strings to represent dates, but this didn't work well 
(not all spreadsheets accept ISO 8601 format, and most other formats are 
locale-specific).  So instead, we converted all tests involving dates to 
simply compare them as being "=" or not with the result of a DATE() 
function.  By using other functions to generate values of a particular 
type, data representation issues disappear, and it becomes (usually)
independent of locale.
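
The approach can be sketched in Python (an analogy only, not
OpenFormula syntax): the expected value is built by a constructor and
compared for equality, so no date string is ever parsed.  The
date_value helper is a hypothetical stand-in for a spreadsheet DATE()
call.

```python
from datetime import date, timedelta

def date_value(year, month, day):
    """Hypothetical stand-in for a spreadsheet DATE(year; month; day) call."""
    return date(year, month, day)

def next_day(d):
    """The operation under test: advance a date by one day."""
    return d + timedelta(days=1)

# Locale-dependent style (avoided): comparing against text such as
# "03/03/2006", which reads as March 3 in some locales and 3 March in others.

# Locale-independent style: compare a value with a constructed value.
result = next_day(date_value(2006, 3, 2))
passed = (result == date_value(2006, 3, 3))
print(passed)
```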

Waldo: Must be sure to express the results. Everything True/False?

Wheeler: Or Number.  Even in right-to-left languages, Arabic numerals are
written left-to-right (to my knowledge).

Richard: Specify correct results in terms of FILE FORMAT, not in terms 
of display (which would vary by locale).  Then it doesn't matter
if it's True/False, Number, string, etc.

Jody: We need a "deprecated" mark?  E.G., the *B functions like LENB, 
and CHAR; these functions are fundamentally very locale and 
implementation-dependent (e.g., they depend on the locale and various 
settings), but some may at least want to know ABOUT them (and why they 
shouldn't be using them).

* "Discuss use of Wiki.  Do we want to try to put stuff in a Wiki and 
LATER transition text to ODF? Transition to an ODF document NOW?  
Transition some text now (e.g., what's in OpenFormula), use Wiki, and 
transition incrementally?  One issue: The Wiki must be MoinMoin, and 
it's unclear if OASIS will install the MoinMoin math formula support.  
Without formula support, formulas may be harder to create."

Rob: Key issue - formula support.  Let's not decide until can see if 
they support it.

Dan: We should use a Wiki as much as possible.

Rob: We need a Wiki while working... once it's more static, move to ODF.

(many): MUST make sure that the Wiki can ONLY be edited by members.

Wheeler: OpenFormula Wiki was only editable by members, very doable.

{We didn't have time in the teleconference to discuss
syntax, semantics, and complex numbers... but these are already being
actively discussed in the mailing list.  However, the issue
of complex numbers raised a larger question about data types, below.}

Dan Bricklin: What about handling open-ended types? (Complex, images,
HTML, etc.).  How can we make sure people know how to extend?

Jody: Beautiful idea, but that's a pipe dream.  When a complex type gets
into a complex numeric calculation, the complex type gets lost.
Fortran routines generally throw them out.

Dan: Yes, the type may not be supported.  So we need to figure out
the subset of functions that it [an application?] supports.

?: Need ground rules for extensible types, type promotion, etc.

Dan: We need to realize that we'll probably have other types.

?: We can avoid talking about data types.

Dan: We need to be explicit about keeping type information.

Wheeler: We should be clear that other types ok; the OpenFormula 
specification specifically stated that implementations MAY add other 
types, though its text could almost certainly be improved on.

Dan: Need to keep subtype info if possible.

There was then a brief discussion about IF, and about scalars vs. arrays.

The meeting ended a little after 1800UTC.

--- David A. Wheeler
