office-formula message

Subject: Re: [office-formula] Semantics

From: robert_weir@us.ibm.com
To: office-formula@lists.oasis-open.org
Date: Wed, 1 Mar 2006 08:34:26 -0500

This is an interesting conversation. We have some of these same debates in designing programming languages -- strict typing with compile-time enforcement, or highly polymorphic late binding? What is best for productivity? What is best for reducing errors? What is best for a solitary user=author, versus large complex sheets with multiple authors and business-critical uses? Spreadsheet errors can be very costly, but most spreadsheets are very simple and have a short lifetime.

I wonder how much of what we have today is due to historic circumstances? When spreadsheets came out, there were not so many capable coder/end-users. Application development was a long, drawn-out and expensive process. It still can be that way, but there are now a larger number of tools based on easy-to-use scripting engines which bring coding within the reach of more people. The typical 1-2-3 user back in 1990 might have been 30 years old but was having their very first experience with a computer. I remember a Computer Languages magazine survey that showed that the 1-2-3 macro language was the most-used programming language in the world.

But look at today -- the typical 14-year-old knows more about computers than adults knew back then. I assume most high school graduates today know how to program in at least one high-level language. My niece is graduating high school in a couple of years with MS Office certification. She already knows how to program Access better than I do. If we were to look forward at the next 20 years of uses and users for spreadsheets, would we be led to make a different set of assumptions about which idioms would be familiar to them, and how much tolerance they would have for doing type conversions explicitly?

I guess my point is that a lot of the syntactic and semantic complications that made it into 1-2-3 and Excel came from a need to support a type of novice user which we may never see again.

-Rob

Eike Rathke <erack@sun.com> wrote on 03/01/2006 08:07:36 AM:

> Hi Tomas,
>
> On Wed, Mar 01, 2006 at 09:45:01 +0100, Tomas Mecir wrote:
>
> > > > First of all, my fundamental principle: as far as the user is
> > > > concerned, there are no datatypes. Everything automatically gets
> > > > converted to what is expected.
> > > This is exactly the hell in which the approach of the one big player
> > > brought us..
> >
> > Really ? What kind of hell ?
>
> The hell of strings being sometimes interpreted as numbers and sometimes
> not. The hell of locale
>
> > I don't see any problem with this approach - after all all those
> > "high-level" programming panguages do the same ...
>
> Which ones do what?
>
> > Or, do you see any reason why strings shouldn't be treated as numbers
> > for values like "3", other than performance reasons ?
>
> My favorite examples: The values of "12'345.67" and "12 345,67" and
> "一二、三四五．六七" and "１２，３４５．６７" are exactly what?
>
> > > Now consider that you pass the document to a colleague who works with
> > > a different locale. Maybe it breaks.
> >
> > How would you do these things, then ?
>
> You better don't. Numbers are numbers, and text is text. User input is
> parsed according to the input locale, after that text content shouldn't
> be reparsed. If the user designs a document the way formulas concatenate
> strings to form numbers that have to be parsed in numeric context it is
> his duty to get it right not to use locale dependent delimiters or to
> not spread the document to someone who works using a different locale.
> Else it may break. You can't prevent that under all circumstances.
>
> > Is it even -possible- to design things so that they're
> > locale-independent, without majorly limiting functionality ?
>
> Nothing limits functionality if you strictly acted on numbers only in
> numeric context.
> The user designing the sheet would notice from the very
> beginning that something doesn't calculate the way he thought it would,
> there'd be no surprise when somebody else working with a different
> locale loaded the document. By allowing the
> guess-what-it-could-be-but-maybe-it's-wrong mechanisms you limit
> functionality.
>
> > I think OOo supports things like "3"+3 is 6, right ?
> > This is locale-dependent as well ...
>
> And I wish it was never introduced. We might even remove it. But that's
> just my personal opinion.
>
> > > > I realize this may create problems, but it follows my fundamental
> > > > principle of no datatypes.
> > > Well, it's your decision, but then you'd have to write COUNTA to the
> > > file format instead of COUNT. And you wouldn't be able to correctly read
> > > a file that uses COUNT.
> >
> > If we decide that the fileformat should distinguish between COUNT and
> > COUNTA, then yeah. You raise some valid arguments here, but I'm not
> > convinced, as both solutions seem to have problems. Either restricted
> > functionality, or problems with locale. Which is more important not to
> > have ? Who can decide that ?
>
> What do the trillion documents in the wild use? There's your answer.
>
> > > Bear in mind that AVERAGEA(range) does not convert strings to numbers,
> > > it treats them as zero. This is a different functionality.
> >
> > Well in my opinion, it should attempt conversion, and treat them as
> > zeroes only upon a failure.
>
> On failure it should generate an error instead. My opinion.
>
> > And see, another inconsistency. =(3+"Hi")/2 should raise an error,
> > AVERAGEA on two values, "Hi" and 3, will yield 1.5 ...
>
> Yes, that's how AVERAGEA is defined. (3+"Hi") is undefined. Note that
> AVERAGEA and other *A are often used where measurement readings are
> missing and a comment is placed instead. So it may make sense to
> evaluate these as zero in this context.
> If the comments are not to be
> evaluated you use AVERAGE instead. It's as simple as that.
>
> Eike
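
To make the distinctions in the quoted exchange concrete, here is a minimal Python sketch. It is not how any spreadsheet actually implements these functions. The names parse_number, strict_add, average and average_a are hypothetical, and the sketch assumes that AVERAGE skips text and empty cells while AVERAGEA counts text as zero, as described in the thread. It also shows why reparsing text as a number depends on the locale convention chosen.

```python
from typing import Iterable, Union

# Hypothetical cell model: a number, a text string, or None for an empty cell.
Cell = Union[int, float, str, None]


def parse_number(text: str, decimal_sep: str, group_sep: str) -> float:
    """Parse text under one explicit locale convention (an assumption of
    this sketch); raises ValueError if the result is not numeric."""
    cleaned = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


def strict_add(a: Cell, b: Cell) -> float:
    """'Numbers are numbers, and text is text': text operands are an error."""
    for operand in (a, b):
        if not isinstance(operand, (int, float)):
            raise TypeError(f"non-numeric operand: {operand!r}")
    return a + b


def average(cells: Iterable[Cell]) -> float:
    """AVERAGE-like: text and empty cells are ignored."""
    nums = [c for c in cells if isinstance(c, (int, float))]
    return sum(nums) / len(nums)


def average_a(cells: Iterable[Cell]) -> float:
    """AVERAGEA-like: text counts as zero, empty cells are ignored."""
    values = [0 if isinstance(c, str) else c for c in cells if c is not None]
    return sum(values) / len(values)


if __name__ == "__main__":
    # The same text yields different numbers under different conventions.
    print(parse_number("12,345", decimal_sep=".", group_sep=","))
    print(parse_number("12,345", decimal_sep=",", group_sep="."))
    # A reading replaced by a comment: AVERAGEA counts it, AVERAGE does not.
    print(average_a(["Hi", 3]), average(["Hi", 3]))
```

Under these assumptions, average_a(["Hi", 3]) gives 1.5 and average(["Hi", 3]) gives 3.0, while strict_add(3, "Hi") raises an error rather than guessing, which is the behaviour Eike argues for in numeric context.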

Follow-Ups:
- Re: [office-formula] Semantics
  - From: Eike Rathke <erack@sun.com>

References:
- Re: [office-formula] Semantics
  - From: Eike Rathke <erack@sun.com>