Subject: Re: [office-formula] Proposal: Drop "huge" group; any critically missing functions for "large"?
Andreas J Guelzow said:
> I do find it disturbing that Excel and OpenOffice.org apparently
> play a different role than the other spreadsheets.

Ah, I understand why you'd say that, because of the point at which you joined. If you'd joined a few months earlier, you'd have asked why SheetToGo and wikiCalc play a different role. When we work on "huge", you'll ask why Gnumeric and Quattro Pro play a different role :-).

It's actually not that way; this is a side-effect of some earlier decisions. They can be revisited, but I think it'd be useful to explain why we are where we are. And it'd be a good idea to document what's going on here in the mailing list. So, here's my try, at least from my perspective.

When we formed the group, had the kickoff teleconference, etc., it was noted by Dan Bricklin (and agreed to by all) that different applications target different markets/customers. WikiCalc is not Gnumeric; they have different target users. Thus, "one size fits all" requirements for functions, capabilities, etc., were unacceptable for a truly universal spec. If you're writing the specification for the interface of a single application (e.g., Excel), it's fine to have a simple list of "you must implement exactly this". But that's unacceptable if you want a REAL open standard for interoperability. Indeed, look at all the discussion on namespaces, so we can be sure that different applications can save their data in a common format; we're serious about getting many different actors on board.

So we agreed that it'd be possible for applications to arbitrarily subset and superset the spec. That's very flexible, but it leads to the next problem... how can I create spreadsheet documents and know that other applications can use them? If every application implements nonoverlapping sets, in the worst case nothing can interoperate. Not okay.

The proposed solution is to document _groups_ (formerly levels) that applications can assert that they conform to, and that portable documents can say they need. Applications don't have to implement a group, but many will (if we provide reasonable groups to choose from). There is no perfect way to determine a group; the best way I know of is to look at existing implementations, and appeal to this group to make tweaks.

The current main groups are small, medium, large, huge; the names are based on the number of functions they include, and are intentionally designed to separate different markets:

* Small is intended for PDA-sized devices and/or those who want less development effort, yet still provide "all the common functions" (whatever THAT means). We have an exemplar: SheetToGo on Palm PDAs. The wikiCalc developers used that list and re-implemented that set of functions, demonstrating that it IS a reasonably implementable set without a massive time investment. About 100 functions.

* Medium is an intermediate step between Small and Large, based on "what most applications implement" - so this takes ALL applications into account. This is a transitional group, for those applications moving from small towards large. This has around 200 functions.

* Large is based on the typical spreadsheet implementations that are included in desktop office suites, such as OpenOffice.org's Calc, Microsoft Excel, and what I understand to be the current goal of KOffice KSpread. Around 300 functions.

* Huge is based on the spreadsheet implementations that are designed to provide the best possible spreadsheet formula capabilities, including the support of many useful but highly specialized functions. These applications are often developed (at least originally) as stand-alone programs, not as part of an office suite (though they typically join one later). Gnumeric and Quattro Pro fit in this class; both have over 400 functions (Gnumeric has the most).

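
To make the group idea concrete, here is a minimal sketch in Python of how a document's needs could be matched against the group an application asserts. It is an illustration only, not anything taken from the spec: the group contents are tiny hypothetical stand-ins (the real groups hold roughly 100 to 400+ functions), the placement of each function is arbitrary, the helper names `smallest_group` and `can_open` are invented, and it assumes each group contains the previous one (the text above only states that for "huge" and "large").

```python
# Hypothetical, heavily abbreviated group contents. Which function sits in
# which group here is purely illustrative, not a statement about the spec.
SMALL = frozenset({"SUM", "AVERAGE", "IF"})
MEDIUM = SMALL | {"VLOOKUP", "ROUND"}       # assumed to contain SMALL
LARGE = MEDIUM | {"BINOMDIST", "BIN2DEC"}   # assumed to contain MEDIUM
HUGE = LARGE | {"SSMEDIAN", "R.DBINOM"}     # "huge" includes "large"

# Ordered from the smallest group to the largest.
GROUPS = [("small", SMALL), ("medium", MEDIUM), ("large", LARGE), ("huge", HUGE)]


def smallest_group(functions_used):
    """Return the name of the smallest group covering every function used.

    Returns None if the document uses a function outside all groups.
    """
    used = frozenset(functions_used)
    for name, members in GROUPS:
        if used <= members:
            return name
    return None


def can_open(app_group, doc_functions):
    """True if an application asserting app_group covers the document's functions."""
    return frozenset(doc_functions) <= dict(GROUPS)[app_group]


if __name__ == "__main__":
    doc = {"SUM", "IF", "BINOMDIST"}
    print(smallest_group(doc))     # the group this document would declare
    print(can_open("small", doc))  # a "small" application cannot promise to open it
    print(can_open("huge", doc))   # a "huge" application can
```

The point of the sketch is only that a document can declare one group name instead of listing every function, and an application can make one assertion instead of publishing its whole function list.
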
Thus, when examining the functions for "large", we look especially at Excel and OpenOffice.org, because those are well-known examples of that market niche. It's not because they are "better" than all other apps. In particular, we want to make sure that existing users of Excel and OpenOffice.org can transition _to_ OpenFormula without losing use of their document files.... and in fact, we want them to be able to exchange files right away with other implementations.

The issue isn't really "what do applications do", it's "what do people's spreadsheet documents depend on"? If many spreadsheet documents depend on something, we need to give them a way to exchange that information between applications.

Rob Weir has correctly pointed out that using existing applications, or this body of people, to group functions is imperfect. True. But there's no higher authority we can appeal to, and producing a spec where formulas do NOT interoperate is not okay. So we'll use our collective best judgement, and I think the results will be quite good.

And here's the problem: A lot of people really need, at most, the "large" set. So I propose that we work in stages - let's get everything through the "large" set done, release the spec, and then work on what's needed for the "huge" set. The "huge" set includes the "large" set, so the current work (even omitting "huge") is still 100% relevant for Gnumeric developers and Gnumeric users. In particular, I expect that many spreadsheet documents created/modified by Gnumeric would be completely covered by the "large" set.

> I would rather see many of the "nonsense" functions dropped
> that are apparently implemented in all spreadsheets than
> useful reasonable functions that currently are only
> implemented in some.

I like the way you think. And in some cases that may be what we should do. In others, maybe that's a poor approach.
After all, we need to make sure users have a transition approach; having a great train is only useful if people can get to the station.

So let's discuss the best approach. What I hope we're doing is leading to the future, while reaching back and helping people get on board as necessary. Per an earlier message, perhaps the right way is to have a LEGACY group, or simply define BIN2DEC (etc.) without requiring them in any particular group. Sometimes, maybe we don't need to define these old legacy nasties at all; that's particularly true if they are essentially unused. Our point is to faithfully exchange spreadsheet DOCUMENTS, not to reimplement any particular app. Together, let's find a way to look behind AND ahead.

> For example, is BIN2DEC really more important than
> BITAND or SSMEDIAN. (SSMEDIAN calculates a standard median used
> in the social sciences for discrete data with repetition.)
...
> GNumeric has two functions called
> BINOMDIST(x,n,p,FALSE) and R.DBINOM(x,n,p,FALSE)
> that both supposedly do the same thing...
> From a mathematical point of view BINOMDIST has serious issues,
> for example:
> BINOMDIST(0.5,10,0.2,FALSE) is 0.107 rather than 0
> BINOMDIST(11,10,0.2,FALSE) is an Error rather than 0

Thanks! Perfect! That was the point of my request. I don't think that the "large" set should include all of Gnumeric's functions, some of which are very specialized. But a few of those functions (at least their semantics) probably _do_ belong in a general-purpose office suite spreadsheet application. I had noted BITAND specifically in my message, as you can see.

The semantic issues that you raise are absolutely critical, too. So let's identify the functions and semantic issues, and address them. That'll help everyone interchange spreadsheet documents.

--- David A. Wheeler
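
For reference, the two BINOMDIST cases quoted above can be reproduced with a small Python sketch. Both function names are hypothetical. `legacy_style_binomdist` is only an assumed model of the quoted behaviour (truncate x, raise an error when x exceeds n), not the code of any actual application, and `strict_binom_pmf` is the textbook probability mass function, which may or may not match what R.DBINOM does in every corner case.

```python
from math import comb


def strict_binom_pmf(x, n, p):
    """Binomial probability mass: zero wherever x is not an integer in 0..n."""
    if x != int(x) or x < 0 or x > n:
        return 0.0
    k = int(x)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def legacy_style_binomdist(x, n, p):
    """Assumed model of the quoted behaviour, not a real implementation.

    Truncates x toward zero, and raises an error when x is out of range
    instead of returning zero.
    """
    k = int(x)
    if x < 0 or k > n:
        raise ValueError("x out of range")
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    print(round(legacy_style_binomdist(0.5, 10, 0.2), 3))  # truncation gives P(X = 0)
    print(strict_binom_pmf(0.5, 10, 0.2))                   # no mass at a non-integer
    print(strict_binom_pmf(11, 10, 0.2))                    # no mass above n
```

Under this model the 0.107 in the first quoted case is just 0.8^10, the probability of zero successes, which is what truncating 0.5 to 0 yields.
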