office-comment message

Subject: Re: [office-comment] Statistical SUMMARY() function

From: Sheldon Britton <sab_orion@verizon.net>
To: robert_weir@us.ibm.com
Date: Wed, 04 Jun 2008 15:33:26 -0400

Hi Robert,

Thank you very much for the timely response and the opportunity to raise the question.

From working experience I am now also taking the opportunity to forward documentation (see attachment) which describes the procedures for the return/computation of Economic Rate of Return (ERR) values as distinct from those of the return/computation of Internal Rate of Return (IRR) as defined in formula listings of OpenOffice's Calc, IBM's Lotus 1-2-3, IBM's Lotus Symphony's 1.0 SpreadSheet et cetera et cetera.

Re: "So how to turn an IRR into an ERR? I suppose the joke answer would be "Invest in sub-prime mortgages"; Wow... (was that "ENRON" or "sub-prime" ;-) !
Response: Which better company to be with other than the one (IBM) which now have the responsibilities for the Federal Housing Loans origination port folio; and, beyond these let us look at the "services opportunities" (delivery solutions...) in the world's energy and commodities markets (food...) among others in that IBM's "Lotus Symphony" like "OpenOffice" can be freely obtained. $$$ :-D

Also, "But seriously, my understanding is that Internal Rate of Return, as we define it in OpenFormula, is the same as what some practitioners call "Economic Rate of Return".

The attached documentation which describes "Economic Rate of Return" employed by the Inter-American Development Bank ( http://www.iadb.org ) I am quite sure will undoubtedly provide some clarification and further understanding about this issue.

All for your kind information.

Kindest regards,

Sheldon

robert_weir@us.ibm.com wrote:

OF9698F775.57DEBB7B-ON8525745E.006183E2-8525745E.0062BD10@lotus.com" type="cite">

Hi Sheldon,

I'm not familiar with an ERR() function meaning  "economic rate of 
return".  The @ERR() function in 1-2-3 merely returns the ERR error value, 
much as NA() returns the NA error value.

So how to turn an IRR into an ERR?  I suppose the joke answer would be 
"Invest in sub-prime mortgages".

But seriously, my understanding is that Internal Rate of Return, as we 
define it in OpenFormula, is the same as what some practioners call 
"Economic Rate of Return". 

-Rob

Sheldon Britton <sab_orion@verizon.net> wrote on 06/04/2008 11:15:05 AM:

Hello Robert,

Thank you for a most interesting email regarding the above mentioned
subject matter which served to remind me that I had some time ago 
submitted a query on this forum regarding the conversion from 
Financial Rate of Return (IRR) to Economic Rate of Return (ERR); 
and, here is the rationale for this on the basis of an extract from 
a 1998 Lotus Development Corporation communication:

Re: "Concerning the issues with 1-2-3 that are talked about in the 
documentation you gave me, most of the issues are related to 
converting files between older and newer versions of product and 
converting documents between Lotus and Microsoft. Anytime a file is 
saved backwards or saved with an older file format than the format 
the file was created under, such as saving a 1-2-3 , 97 file for 
Windows 95 into a WK1 format for DOS, then naturally we are expected
to loose certain features due to technology and features that are 
present now that were not present 8 - 10 years ago. Similarly, if we
try to convert a file from Lotus into Excel or Excel into Lotus, due
to differences in the products not every feature will be converted 
perfectly with the file filters that are available. Both Lotus and 
Microsoft create similar spreadsheet programs; however, there are 
several differences in both programs and these differences will 
remain to distinguish the products apart. We do try to design 
conversion filters that will allow as much of the file formats as 
possible to be exchanged and converted without disrupting the actual
file design and format.

In one of your letters you made mention of the @IRR and @ERR 
functions in the 1-2-3 product. By design the @IRR (notably "absent"
in Open Office) will calculate the Internal Rate of Return; where 
the @ERR is used in conjunction with other formulas, posted was an 
"ERR" showing an error was received in the calculations. As far as I
can see in the program I cannot find an @ERR function that will 
allow us to calculate an Economic Rate of Return"

Can you say to what extent can the question regarding the conversion

from

Internal Rate of Return (IRR) to Economic Rate of Return (ERR) 
within the framework of the "Open Formula 1.0..." be addressed at 
this time; and, if so what would be the strategies for implementation.

Thank you.

Best regards,

Sheldon A. Britton
Orion Resources International
(IBM Business Partner - PartnerWorld for Developers)

robert_weir@us.ibm.com wrote: 
HI Leonard,

Thanks for the proposal.

Another way of accomplish this might be something in the middle.  So, 
imagine a application MENU choice that instead of calculating the values

and pasting them in as static values -- an approach with the liabilities

you mention -- have the Menu option paste in the underlying formulas. So

a menu option that pastes in a range/block of formulas.  This approach 
might be applicable in other places as well: a formula block template, 
where you have a group of related functions that you want to paste in at

once, but have a single dialog box for filling out the common

parameters.

This would make it more of an application UI feature than a format 
question.

In general, OpenFormula 1.0  is not trying to extend the conventional 
spreadsheet formula model used by Excel, etc.  We're trying to create 
something that is well-documented and compatible.  However, once we have

that core, I am quite open to the idea of extending it in the future to 
create more specialized functionality. 

My preference would be to treat this holistically.  For example, we take

it as a task to improve ODF for use in engineering.  We create a 
subcommittee to investigate this topic, talk with engineers, find out

what

is needed, etc., and the results come back that spreadsheet needs some 
more formulas, more chart types, and presentations need the ability to

use

mathematical notation.  There could be similar efforts for medical, 
academic, financial, etc.  But it will work best where we an alignment

of

interested experts willing to help specify the requirements, as well as 
vendors or open source projects willing to implement these features.

-Rob

Leonard Mada <discoleo@gmx.net> wrote on 06/02/2008 06:42:36 PM:


Dear list members,

One of the most used features in spreadsheets is to analyze and 
summarize the data. This is used in academia and businesses alike.

In both instances, spreadsheet users often need to perform an 
exploratory data analysis and in one of the first steps they will 
summarize their data.

I discussed this issue more fully at:
http://www.openoffice.org/issues/show_bug.cgi?id=89980

While Calc (and ODF) has (most) functions implemented as separate 
funcitons, it lacks a unified function to compute all parameters at 
once. MS Excel does come with such a 'Menu'-entry in the advanced 
statistical toolpack.


A.) MENU vs FUNCTION
====================
Excel did implement a menu-command. Unfortunately, IF the data-changes, 
the computed statistics won't change and need to be manually updated. 
The user might not notice that the data has changed.

Therefore I opt strongly against the Menu-command and for a 

proper-function.

Also, using menu commands is not transparent, especially IF the 
spreadsheet will be later used by someone else or if the user revisits 
the spreadsheet a couple of months later.


B.) COMPARATIVE SOFTWARE ANALYSIS
=================================

1.) MS Excel: does have a SUMMARY-Menu command in the advanced
    statistical package (but no separate function)

2.) professional statistical programs:
    e.g. R (http://cran.R-project.org) has a summary() function
    [and some other summary-type functions, like fivenum()]

C.) UTILITY
===========
Both academia (research) and businesses alike would benefit from such a 
feature.

Users will be able to easily get all the relevant summaries. Summaries 
will be formatted as for publishing / presentation (see options later).

D.) REASONING
=============
Despite my desire to teach people correctly summarizing the data, and 
compute the descriptive statistics, I have to admit that this is an 
illusive goal. It is much easier to have a function that computes 
everything automatically without any user-interaction, than hope for ALL


these people learning descriptive statistics.


D.) IMPLEMENTATION
==================

1.) Function-Name: obviously: SUMMARY()

2.) Parameters: (range; option)
  - range: the data range in the spreadsheet
  - option:
     -- default: "BRIEF" OR 0: compute most important summaries
     -- "DETAILED" OR 1: compute advanced summaries
     -- "PUBLICATION" OR 2: compute summaries and format as for
         publication in a professional journal
     -- "TABLE" OR 4: compute summaries as for publication,
         BUT display them in tabular format (for inclusion in a table)
     -- some other: ... lets leave the feature-list open
  - results:
     -- an array with the important summaries

IMPORTANT: This is an array function!

REASONING for Options:
 - most of the time, users need only some basic descriptive statistics.
 - I am against displaying ALL possible parameters, when most of them
   are of NO use.
  [this is descriptive statistics]
 - the 'DETAILED' option will print more advanced statistics.
 - the 'PUBLICATION' OR 'PRINT'-option will additionally format the 

results

   as for print, so that the user can simply COPY/PASTE the results
 - the 'TABLE' is as the 'PUBLICATION', but in tabular format

3.) COMPUTED OUTPUT
===================
 a.) the function shall verify IF the data is:
  -- numerical non-binary
  -- binary (numerical or categorical)
  -- categorical, NON-binary
  -- obviously, it can't detect ordinal values

 b.) it should verify IF the first row in the series is a LABEL
  -- it should skip any eventual labels

 c.) computed values should be:

 NUMERICAL DATA
   BRIEF DESCRIPTIVE STATISTICS
     min
     first quartile
     median
     third quartile
     max
     mean
     standard deviation
     median absolute deviation of the median (MAD)
      -- has the best possible breakdown point
    (I deliberately omit the mode; NO strong opinion about variance)

   ADVANCED DESCRIPTIVE STATISTICS
     ALL BRIEF PLUS additionally
     variance
     mode (well, lets include it)
     standard error
     CV (coefficient of variation)
     percentile 2.5% and 97.5%
     [for the IPR95%]
     mean absolute deviation of the mean (ADM - or average deviation)
     [although this is better known as the previously mentioned MAD, it 

is

      actually a much less robust estimator than the MAD]
     very robust locationless estimators:
       Sn = 1.1926 * MEDi{MEDj|Xi-Xj|)
       Qn = QUANTILE( |Xi-Xj|; 0.25), where i<j
    [see http://www.jstor.org/pss/2291267 for details]

   PRINT
    - add text to the same cells and
    - group relevant parameters together
    - to allow easy COPY/PASTE:
    *median (range)* 'median (min-max)'
         -- the '...' are the actual values for these parameters
         -- the *...* are strings
         -- the range, IQR and IPR95% refer to the 2 separate values,
             NOT the difference
    *median (interquartile range)* 'value (value - value)'
    *median (IPR95%)* 'value (value - value)'
    *mean (SD)* 'value (value)'

    - both the TEXT and the values should be provided in the same cell,
      so as to allow easy COPY/PASTE
    - both 3 versions of the MEDIAN (...) should be provided
      [they serve different purposes and range / IQR / IPR95% are always
       grouped with the median]
    - I prefer 'mean (SD)' over mean +- SD, because the mean +- SD makes
      actually NO sense; it is mean +- 2*SD that makes sense

   TABLE:
    - as PRINT, BUT display the names/text-strings in the first column
    - and the values in the next column
    - to easily COPY/PASTE as a table

I will discuss the NON-numerical details in a later post.

Sincerely,

Leonard

P.S. I hope to get some time later this week to rephrase the BINOMDIST()


description. I did not forget about it, I was just too busy.

-- 
This publicly archived list offers a means to provide input to the
OASIS Open Document Format for Office Applications (OpenDocument) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: office-comment-subscribe@lists.oasis-open.org
Unsubscribe: office-comment-unsubscribe@lists.oasis-open.org
List help: office-comment-help@lists.oasis-open.org
List archive: http://lists.oasis-open.org/archives/office-comment/
Feedback License: http://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
Committee: 

http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office

ERRPROCEDURES.doc

Follow-Ups:
- Re: [office-comment] Statistical SUMMARY() function
  - From: robert_weir@us.ibm.com

References:
- Statistical SUMMARY() function
  - From: Leonard Mada <discoleo@gmx.net>
- Re: [office-comment] Statistical SUMMARY() function
  - From: robert_weir@us.ibm.com
- Re: [office-comment] Statistical SUMMARY() function
  - From: Sheldon Britton <sab_orion@verizon.net>
- Re: [office-comment] Statistical SUMMARY() function
  - From: robert_weir@us.ibm.com