Subject: [OASIS Issue Tracker] Commented: (OFFICE-2309) LEGACY.CHITEST

• From: OASIS Issues Tracker <workgroup_mailer@lists.oasis-open.org>
• To: office@lists.oasis-open.org
• Date: Wed, 13 Jan 2010 23:04:15 -0500 (EST)

http://tools.oasis-open.org/issues/browse/OFFICE-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479#action_17479

Andreas Guelzow commented on OFFICE-2309:
You are confusing the tests. There is a goodness-of-fit test, which requires the expected values and for which you do not have an n by m _table_. There is also a test of independence and a test of homogeneity, both of which calculate the expected values from the row and column totals.

All of these tests use the chi-square distribution but test different hypotheses. This function performs a goodness-of-fit test but uses the degrees of freedom of an independence/homogeneity test if the data is in an n by m table.
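
To make the distinction concrete, here is a minimal Python sketch. The function names are illustrative only and are not taken from any spreadsheet or from the OpenFormula draft. It assumes the usual textbook definitions: for an independence/homogeneity test the expected counts come from the row and column totals and the degrees of freedom are (n-1)(m-1); for a goodness-of-fit test with fully specified expected values over k cells (and no parameters estimated from the data) the degrees of freedom are k-1.

```python
def expected_from_margins(table):
    """Expected counts for an independence/homogeneity test:
    E[i][j] = row_total[i] * col_total[j] / grand_total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells; the same formula serves
    all three tests, only the source of E and the df differ."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def df_independence(table):
    """Degrees of freedom for independence/homogeneity: (n-1)(m-1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


def df_goodness_of_fit(table):
    """Degrees of freedom for goodness of fit with externally supplied
    expected values (assumption: none estimated from the data): k-1."""
    return len(table) * len(table[0]) - 1


# Hypothetical 2 by 3 table of observed counts.
observed = [[10, 20, 30],
            [20, 20, 20]]
expected = expected_from_margins(observed)
statistic = chi_square_statistic(observed, expected)

print(expected)
print(statistic)
print(df_independence(observed), df_goodness_of_fit(observed))
```

For the same 2 by 3 layout the two conventions give 2 versus 5 degrees of freedom, which is the mismatch described above when a goodness-of-fit statistic is paired with the independence-test degrees of freedom.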

> LEGACY.CHITEST
>                 Key: OFFICE-2309
>                 URL: http://tools.oasis-open.org/issues/browse/OFFICE-2309
>             Project: OASIS Open Document Format for Office Applications (OpenDocument) TC
>          Components: OpenFormula
>    Affects Versions: ODF 1.2
>            Reporter: Robert Weir
> 3.) LEGACY.CHITEST
> > Note: Applications usually describe the CHITEST function as a
> > Chi-square independence test. From a mathematical point of view this
> > is not correct, as that would not involve testing some actual data
> > against a set of expected values. It resembles more a Goodness-of-Fit
> > test, but how the degrees of freedom are calculated actually doesn't
> > make sense then. This is specified to be interoperable with Excel and
> > OpenOffice.org. Gnumeric gets different results if the number of rows
> > and columns both are greater than 2.
> Well, I suggest comparing the results with the FISHER-EXACT test, e.g. in R.
> Also, no statistical package (R, EPI INFO, SPSS, ...) needs the
> expected values, as each computes them automatically from the n*m table.
> I wonder why spreadsheets do NOT do it automatically, as well. Most
> users simply fail to compute the correct value. Well, I try to teach
> them, but almost everyone will get it wrong a week later. [It is way
> easier to remember the shortcut for a 2x2 table, aka (ad-bc)^2 * N/(n1 *
> n2 * n3 * n4), where n1-4 are the 4 subtotals, than to compute the
> expected values accurately.]
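
The 2x2 shortcut quoted above can be checked against the full expected-value calculation with a short Python sketch. The names below are illustrative, the cell counts are made up, and no continuity (Yates) correction is applied. The p-value helper relies on the identity that, for 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(x/2)).

```python
import math


def chi_square_2x2_shortcut(a, b, c, d):
    """Shortcut for the table [[a, b], [c, d]]:
    (ad - bc)^2 * N / (n1 * n2 * n3 * n4),
    where n1..n4 are the two row totals and the two column totals."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return (a * d - b * c) ** 2 * n / (row1 * row2 * col1 * col2)


def chi_square_2x2_full(a, b, c, d):
    """Same statistic via expected counts from the row and column totals."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    total = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / n
            total += (observed[i][j] - e) ** 2 / e
    return total


def p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree
    of freedom, which applies to a 2x2 independence test."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts for illustration.
shortcut = chi_square_2x2_shortcut(10, 20, 30, 40)
full = chi_square_2x2_full(10, 20, 30, 40)
print(shortcut, full, p_value_df1(shortcut))
```

Both routes give the same statistic up to rounding, so the shortcut is only a convenience for the 2x2 case; larger tables still need the expected counts from the row and column totals.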

