[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: RE: [office-formula] Test Cases
I'm uneasy about that description of the situation because, although it describes one of many states-of-affairs, it doesn't specify anything. It also doesn't indicate how one might get a handle on limitations of computational precision in the arithmetic operators and the functions that necessarily deliver approximations and are also subject to computational limitations. I think there are two intertwined matters here. One has to do with what the data types are and the different levels of abstraction and representation that we need to have our heads around when we talk about a Number, a String (or Text), etc. Sticking to questions of test cases I have three immediate concerns: 1. With regard to test cases in the specification, I find that I am seriously ambivalent (more about that after my signature). I don't believe there should be test cases in this specification and they certainly should not have normative significance. Normative observations about how well a computation behaves and how consistent the computational behavior is in a formula evaluator need to be expressed some other way. In particular, one should be able to know what is needed without seeing test cases. 2. With regard to the use of indicators for approximation error, the little epsilons that are shown, these are not informative. For example, we don't account in any way for how epsilons on the input bears on epsilon of the result. E.g., for very small values of x, approximations for sin(x+epsilon) will return x+epsilon for epsilon measurable but smaller in magnitude than x. On the other hand, if we are talking about exp(x+epsilon) the result will be something like exp(x)*exp(epsilon) with whatever epsilon2 there is on the result itself. I use these examples because they are easy to visualize mathematically. There are similar situations around use of interest rates, results that will be made into exact currency values, and so on, and the handling of ratios is generally a source of surprises. I am not proposing a solution to this, but we need to not be foolish about it. It would be more useful, to me, to see examples that demonstrate what sort of epsilon arises in a given implementation or use case so that it could be noticed and folks (both implementers and expert users assessing implementations) would not be surprised. 3. When we talk about test results, I wonder WHAT VALUE ARE WE TALKING ABOUT? WHAT VALUE ARE WE LOOKING AT? When an operation (or a literal number in a formula) delivers an intermediate result to another operation, this may be quite a different matter from when a persistent result is retained as the value of a cell of some sort. We then need to distinguish what that value may be (as it will be delivered into the recalculation of some other formula or cell) versus (1) how that value is persisted for communication as a cell value or result in a document format and (2) how that value might appear as a viewable result (displayed or printed) after formatting rules and other conversions (such as implicit rounding) are applied. I suspect, in this case, we are concerned with the evaluation result that a formula-hosting implementation is sitting on. But in that case, one assesses that value indirectly, usually with other formulas, not by casual inspection of the way the result is shown by default in a cell-value display. - Dennis PS: As much as I don't think test cases of the kind we have should be in the specification and that "test cases" should not have standalone normative significance, I yearn for some sort of demonstration formulas that allow important characteristics of a function to be demonstrated. I don't know where these should be found and what the accompanying guidance should be. On OIC I said "I am thinking of simple relationships and identities that should be apparent (or not) on inspection of certain cases, cases that reveal actual deviations resulting from calculation and number-representation limitations, and handling of [near-]edge cases and cases generally though of as undefined or out-of-bounds." I still think that would be extremely valuable. Not sure where it goes, if anywhere, in respect to the OpenFormula specification. -----Original Message----- From: Eric Patterson [mailto:ericpa@exchange.microsoft.com] Sent: Monday, January 18, 2010 16:25 To: office-formula@lists.oasis-open.org Subject: [office-formula] Test Cases I'm starting to go through the open JIRA issues related to test cases. My goal is to ensure that the normative text accounts for any issues that have been reported against the related test cases. Many of the test cases include an Epsilon value showing the acceptable range of return values. Before dismissing this aspect of the test cases, we should make sure that we are all in agreement that this is adequately covered in the written spec. I believe that the below part from section 3.3.1 (In Patrick's latest upload) covers the precision issue. If this can be added to the agenda for tomorrow's meeting we can see if we have consensus. Thanks, Eric Number General A number is simply a numeric value such as 0, -4.5, or $1000. Numbers shall be able to represent fractional values (they shall not be limited to only integers). The "number" type may be displayed in many different formats, including date, time, percentage, and currency. Typical implementations implement numbers as 64-bit IEEE floating point values and use the CPU's floating-point instructions where available (so intermediate values may be represented using more than 64 bits). However, implementations have great freedom in how they implement Number, and may use representations with a fixed bit length or a variable bit length. A cell with a constant numeric value has the number type. Note that many formula creators are not sophisticated in their understanding of how computers determine their results. Many users, for example, do not understand computer floating point arithmetic models, and have no idea that many implementations use a base other than 10 (or what that would mean). In particular, many implementations use base 2 representations, with the result that value 0.1 can only be represented imprecisely (just as 1/3 can only imprecisely represented in a base 10 decimal representation). This problem applies to all uses of typical computing equipment, including nearly all programming languages, and this specification does not attempt to fully resolve the problem of unsophisticated users. Rationale: Originally some efforts were expended to try to make formulas produce the "expected answer" for unsophisticated users. In particular, the equal-to operator for numbers matches imprecisely in many applications, because many users do not understand that (1/3)*3 on most implementations will produce a value close to one but not precisely equal to one. Originally there was a test to ensure that (1/3)*3 was equal to 1. The Gnumeric developers objected, on the grounds that requiring that equality be "sloppy" made it very difficult for sophisticated users to use spreadsheets to their full capabilities. In contrast, the function INT still requires that INT((1/3)*3) is 1, because if INT does not do so, many user's spreadsheets will not work as they expect. The expected answers of INT may not make numerical analysts happy, but users will get what appears (to them) to be wrong answers otherwise. --------------------------------------------------------------------- To unsubscribe from this mail list, you must leave the OASIS TC that generates this mail. Follow this link to all your TCs in OASIS at: https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]