Subject: Re: (fwd) Should OpenFormula BASE() and DECIMAL() definitions list character set?
I checked with our local Unicode guru, to make sure we were expressing this right. He confirmed that it is correct to refer to the "value space" as Unicode "characters" and "strings", and the serialized versions as "encoded characters".

I'm still working on the "abstract machine" model for OpenFormula. But I wonder if we can say something like:

-----------------------------
A formula evaluator is an abstract machine that takes a Unicode string as input and returns as output either a single value or a 2-dimensional matrix of values, where each value is one of the following types:

1) A Unicode string
2) A floating point number
3) An error value from those enumerated in section x.y.z
---------------------------------

I assume we need to allow the return of a matrix to support array formulas.

Note that we say nothing here about how things are displayed. This is all in the abstract. How things are displayed is up to the spreadsheet specification, and is dependent on the underlying cell format, e.g., date versus currency versus numeric.

If we wanted to, we could have a richer type system, including explicit date and Boolean types. Although applications do not explicitly have those types evident in the UI, it may be simpler to define OpenFormula in a way that has them as types in the abstract machine. If we did that, we would need to define implicit conversions between types. If we don't, we would need to define how numbers are converted to dates. Same thing, different formalism.... I think...

We would also need, if we go down this road, to define the capabilities of the abstract machine in terms of the basic mathematical and string operations it can support, such as basic arithmetic, numeric integration, string operations, etc. We would then ensure that each of the spreadsheet functions was defined in terms of the enumerated operations of the abstract machine. This doesn't need to be a huge amount of work.
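
As a rough illustration of the output side of such a machine, here is a minimal Python sketch of the value space described above. The names ErrorValue, Scalar, Result, is_scalar and is_valid_result are hypothetical, and since the error list is still to be enumerated, an error is modeled only as a named placeholder. Python's str is a sequence of Unicode code points, so it stands in for "Unicode string" here.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class ErrorValue:
    """Placeholder for one of the error values the spec would enumerate."""
    name: str


# The three value types named in the draft wording.
Scalar = Union[str, float, ErrorValue]
# A result is either one scalar or a rectangular 2-D matrix of scalars.
Result = Union[Scalar, List[List[Scalar]]]


def is_scalar(value: object) -> bool:
    """True if value is a string, a floating point number, or an error."""
    return isinstance(value, (str, float, ErrorValue))


def is_valid_result(result: object) -> bool:
    """True if result is a scalar or a non-empty rectangular matrix of scalars."""
    if is_scalar(result):
        return True
    if not isinstance(result, list) or not result:
        return False
    if not all(isinstance(row, list) and row for row in result):
        return False
    width = len(result[0])
    return all(
        len(row) == width and all(is_scalar(v) for v in row)
        for row in result
    )
```

Adding explicit date or Boolean types would just mean extending Scalar; the implicit conversions between types would still have to be written down separately.
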
I think a scan through the formulas would give us a list of primitive operations that the abstract machine would need to support.

Once you have all that, then we define conformance:

----------------
"A conforming formula evaluator shall return results that are identical to those defined by this specification, or for floating point return values, results that are within the required tolerances"
----------------

Something like that. The idea is to make the conformance definition be based on matching the outputs of a defined abstract machine.

I'd be interested in opinions on this. Is this a total waste of time? Or would this approach appreciably strengthen the specification, especially in the areas that Patrick was concerned about?

-Rob

"David A. Wheeler" <email@example.com> wrote on 05/06/2009 02:09:48 PM:

>
>
> robert_weir wrote:
> > I'm cc'ing David Wheeler, who is chair of the formula subcommittee, and
> > Eike Rathke, who is editing OpenFormula (or at least will when he returns
> > from vacation).
> ...
> > In any case, I agree with your observation, that the range 'A'-'Z',
> > without stating an encoding is locale-dependent and this is a problem we
> > need to fix.
> ...
> > I suggest we explicitly state, in the introduction to the specification,
> > that all string and character literals in OpenFormula are in the value
> > space defined by Unicode. Then we can refer unambiguously to U+0041 ('A')
> > through U+005A ('Z') and U+0030 ('0') through U+0039 ('9').
>
> Great solution. Any objections? I think we should do this.
>
> --- David A. Wheeler
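
For what it's worth, the code-point wording quoted above could be exercised with something like the following Python sketch. The names digit_value and parse_in_radix are made up for illustration and are not OpenFormula functions. The radix limits of 2 to 36 are an assumption of this sketch (ten digits plus twenty-six letters give thirty-six symbols), and so is the choice to reject lowercase letters.

```python
def digit_value(ch: str) -> int:
    """Map one character to its digit value using explicit code points only."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    cp = ord(ch)
    if 0x0030 <= cp <= 0x0039:      # U+0030 ('0') through U+0039 ('9')
        return cp - 0x0030
    if 0x0041 <= cp <= 0x005A:      # U+0041 ('A') through U+005A ('Z')
        return cp - 0x0041 + 10
    raise ValueError(f"U+{cp:04X} is not an accepted digit")


def parse_in_radix(text: str, radix: int) -> int:
    """Convert text to an integer in the given radix (assumed range 2..36)."""
    if not 2 <= radix <= 36:
        raise ValueError("radix must be between 2 and 36")
    if not text:
        raise ValueError("empty input")
    total = 0
    for ch in text:
        d = digit_value(ch)
        if d >= radix:
            raise ValueError(f"digit {ch!r} is out of range for radix {radix}")
        total = total * radix + d
    return total
```

The point of spelling out the ranges is that nothing depends on locale or on a library's notion of "digit": a character such as U+0663 (ARABIC-INDIC DIGIT THREE) is rejected here, even though Python's built-in int() would accept it.
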