office-formula message

Subject: Re: (fwd) Should OpenFormula BASE() and DECIMAL() definitions list character set?

From: robert_weir@us.ibm.com
To: "David A. Wheeler" <dwheeler@dwheeler.com>, Eike Rathke <Eike.Rathke@sun.com>, Michael Brauer <michael.brauer@sun.com>, OASIS ODFF SC <office-formula@lists.oasis-open.org>, patrick@durusau.net
Date: Wed, 6 May 2009 14:33:45 -0400

I checked with our local Unicode guru, to make sure we were expressing 
this right.  He confirmed that it is correct to refer to the "value space" 
as Unicode "characters" and "strings", and the serialized versions as 
"encoded characters".

I'm still working on the "abstract machine" model for OpenFormula.  But I 
wonder if we can say something like:

-----------------------------

A formula evaluator is an abstract machine that takes a Unicode string as 
input and returns as output either a single value or a 2-dimensional 
matrix of values, where each value is one of the following types:

1) A Unicode string
2) A floating point number
3) An error value from those enumerated in section x.y.z

---------------------------------

I assume we need to allow the return of a matrix to support array 
formulas.
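
To make the shape of that output concrete, here is a minimal sketch in Python.  All of the names (FormulaError, Scalar, Matrix, is_valid_result) are hypothetical and chosen only for illustration, and the error code used in the usage note is a placeholder, since the actual list would come from section x.y.z.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class FormulaError:
    """Placeholder for an error value; the real list would come from section x.y.z."""
    code: str


# A single value is a Unicode string, a floating point number, or an error value.
Scalar = Union[str, float, FormulaError]
# A 2-dimensional matrix is modelled here as a list of equal-length rows.
Matrix = List[List[Scalar]]
Result = Union[Scalar, Matrix]


def is_scalar(value: object) -> bool:
    """True if value is one of the three scalar types in this sketch."""
    return isinstance(value, (str, float, FormulaError))


def is_valid_result(value: object) -> bool:
    """True if value is a scalar or a non-empty rectangular matrix of scalars."""
    if is_scalar(value):
        return True
    if isinstance(value, list) and value and all(
        isinstance(row, list) and row for row in value
    ):
        width = len(value[0])
        return all(
            len(row) == width and all(is_scalar(cell) for cell in row)
            for row in value
        )
    return False
```

With this, is_valid_result("abc") and is_valid_result([[1.0, "a"], [2.0, FormulaError("#ERR")]]) both hold, while a ragged matrix such as [[1.0], [1.0, 2.0]] is rejected.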

Note that we say nothing here about how things are displayed.  This is all 
in the abstract.  How things are displayed is up to the spreadsheet 
specification, and is dependent on the underlying cell format, e.g., date 
versus currency versus numeric.   If we wanted to, we could have a richer 
type system, including explicit date and Boolean types.  Although 
applications do not explicitly have those types evident in the UI, it may 
be simpler to define OpenFormula in a way that has them as types in the 
abstract machine.  If we did that we would need to define implicit 
conversions between types.  If we don't, we would need to define how 
numbers are converted to dates.  Same thing, different formalism.... I 
think...

We would also need, if we go down this road, to define the capabilities of 
the abstract machine in terms of basic mathematical and string operations 
it can support, such as basic arithmetical operations, numeric 
integration, string handling, etc. We would then ensure that each of the 
spreadsheet functions was defined in terms of the enumerated operations of 
the abstract machine.  This doesn't need to be a huge amount of work.  I 
think a scan through the formulas would give us a list of primitive 
operations that the abstract machine would need to support.

Once you have all that, then we define conformance:

----------------

"A conforming formula evaluator shall return results that are identical to 
those defined by this specification, or, for floating point return values, 
results that are within the required tolerances."

----------------

Something like that.  The idea is to make the conformance definition be 
based on matching the outputs of a defined abstract machine. 
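
As a rough sketch of what such a conformance check could look like, assuming Python: strings and error values must match exactly, floating point values must agree within a tolerance, and matrices are compared cell by cell.  The function name conforms and the default tolerance values are assumptions made for this example; the required tolerances would have to be set by the specification.

```python
import math


def conforms(expected, actual, rel_tol=1e-9, abs_tol=0.0):
    """Compare an evaluator's output against the value the spec defines.

    rel_tol and abs_tol are illustrative defaults, not values from any
    specification.
    """
    # Matrices (nested lists) are compared element by element.
    if isinstance(expected, list):
        return (
            isinstance(actual, list)
            and len(expected) == len(actual)
            and all(
                conforms(e, a, rel_tol, abs_tol)
                for e, a in zip(expected, actual)
            )
        )
    # Floating point results only need to be within tolerance.
    if isinstance(expected, float):
        return isinstance(actual, float) and math.isclose(
            expected, actual, rel_tol=rel_tol, abs_tol=abs_tol
        )
    # Everything else (strings, error values) must be identical.
    return type(expected) is type(actual) and expected == actual
```

For example, conforms(0.3, 0.1 + 0.2) holds even though the two floats differ in their last bits, while conforms("abc", "abd") does not.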

I'd be interested in opinions on this.  Is this a total waste of time? Or 
would this approach appreciably strengthen the specification, 
especially in the areas that Patrick was concerned about?

-Rob

"David A. Wheeler" <dwheeler@dwheeler.com> wrote on 05/06/2009 02:09:48 PM:
>
> robert_weir wrote:
> > I'm cc'ing David Wheeler, who is chair of the formula subcommittee, and
> > Eike Rathke, who is editing OpenFormula (or at least will when he returns
> > from vacation).
> ...
> > In any case, I agree with your observation, that the range 'A'-'Z',
> > without stating an encoding is locale-dependent and this is a problem we
> > need to fix.
> ...
> > I suggest we explicitly state, in the introduction to the specification,
> > that all string and character literals in OpenFormula are in the value
> > space defined by Unicode.  Then we can refer unambiguously to U+0041 ('A')
> > through U+005A ('Z') and U+0030 ('0') through U+0039 ('9').
> 
> Great solution.  Any objections?  I think we should do this.
> 
> --- David A. Wheeler
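
For illustration, a minimal Python sketch of how a DECIMAL()-style conversion might identify digits by Unicode code point rather than by a locale-dependent 'A'-'Z' range.  The function names digit_value and decimal are hypothetical, and this is not the specification's definition of DECIMAL(); in particular, the sketch assumes only the two code point ranges named above are accepted, so lowercase letters and other digit characters are rejected.

```python
def digit_value(ch: str) -> int:
    """Map one character to its digit value using Unicode code points."""
    cp = ord(ch)
    if 0x0030 <= cp <= 0x0039:      # U+0030 ('0') through U+0039 ('9')
        return cp - 0x0030
    if 0x0041 <= cp <= 0x005A:      # U+0041 ('A') through U+005A ('Z')
        return cp - 0x0041 + 10
    raise ValueError(f"not a digit in this sketch: U+{cp:04X}")


def decimal(text: str, radix: int) -> int:
    """Convert text in the given radix (2 to 36) to an integer."""
    if not 2 <= radix <= 36:
        raise ValueError("radix must be between 2 and 36")
    if not text:
        raise ValueError("empty input")
    total = 0
    for ch in text:
        d = digit_value(ch)
        if d >= radix:
            raise ValueError(f"digit {ch!r} is out of range for radix {radix}")
        total = total * radix + d
    return total
```

Here decimal("FF", 16) gives 255, and a full-width digit such as U+FF11 is rejected regardless of locale, because it falls outside both ranges.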

Follow-Ups:
- RE: [office-formula] Re: (fwd) Should OpenFormula BASE() and DECIMAL() definitions list character set?
  - From: "Dennis E. Hamilton" <dennis.hamilton@acm.org>

References:
- Re: (fwd) Should OpenFormula BASE() and DECIMAL() definitions list character set?
  - From: "David A. Wheeler" <dwheeler@dwheeler.com>