OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

office-formula message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: RE: [office-formula] Conflict between Part 1 and Part 2


I think that puts us back where we started.  We use the null-date and acknowledge that dates between 1-Jan and 28-Feb may be different.  No further attributes are needed.  Anyone disagree to this strategy?

-Eric

-----Original Message-----
From: robert_weir@us.ibm.com [mailto:robert_weir@us.ibm.com] 
Sent: Thursday, February 04, 2010 3:33 PM
To: Eric Patterson
Cc: office-formula@lists.oasis-open.org
Subject: RE: [office-formula] Conflict between Part 1 and Part 2

Eric Patterson <ericpa@exchange.microsoft.com> wrote on 02/04/2010 
05:22:48 PM:

> 
> The number of existing documents with this bug should have some 
> relevance.  There are billions if not trillions of documents out 
> there with this bug.  The days between 1-Jan-1900 and 28-FEB-1900 
> are less important than being able to transfer documents between 
> applications in a way that insures that Dates, Serial Dates and all 
> calculations involving them for dates after 1-Mar-1900 are correct. 
> 

The solution involves application code, not documents.  You write the code 
once and then it will work whether you then load one legacy document or 
billions.  So net effort is proportionate to the number of applications 
that need to change their code, not the number of documents that they then 
load.  That is why I said that the number of legacy documents was not 
relevant.

> Having a flag to indicate whether 29-Feb-1900 should be allowed as a
> date is fine, but there needs to be a definitive way of defining 
> what serial number represents 1-Mar-1900 such that date calculations
> can occur.  Note that since dates can be serial numbers they are 
> often included in formulas as such and cannot be identified as dates
> or converted during a save operation.
> 

The interesting thing is that Excel, Gnumeric, Google, OpenOffice, 
Symphony all agree that today is 40,213.  In fact, we seem to all agree on 
the date->number conversions for all dates except for 1/1/1900-3/1/1900.

So that is good.

However, as far as I can tell, only Excel has that *1900-02-29 date. What 
appears to have happened is that other apps set their epoch to be 
12-31-1899, giving that extra day that Excel inserts for *1900-02-29. 

And as far as I can tell, the same apps, when they load Excel XLS 
spreadsheets with the date *1900-02-29, all get it wrong.  Most display it 
as 1900-02-28.  One displays it as 1900-03-01.  I imagine the same thing 
is true in the reverse.  If I create a spreadsheet in OpenOffice/Google 
Docs/Gumeric or whatever, with a date between 1900-01-01 and 1900-02-28 
and save it to XLS format, Excel will display a different serial number 
for that date than the authoring application did.  Ditto for OOXML. 
Apparently the world has not ended, so methinks there are not many 
documents that actually trigger this bug.

I don't think we will realistically see any spreadsheet vendor change the 
way they calculate date serial numbers. Do you?  If we added a boolean 
flag for 1900-leap year calculations, would Excel change their code to 
calculate serial numbers for dates between 1900-01-01 and 1900-02-28 
according to spec when loading a document where the flag was set to 
indicate that 1900 was not a leap year?

So I wonder whether the solution should look something like this:

1) Excel, when interpreting the null-date parameter, adds 1 day to it if 
null-date is before 1900-03-01, to account for the *1900-02-29 bug.  So if 
the document days the null-date is 1899-12-31, you treat it as 01-01-1900. 
 If null-date is 1904-01-01, however, you don't change it.

2) Excel/Windows, when writing an ODF spreadsheet, sets null-date to be 
1899-12-31, essentially subtracting the non-existant 1900-02-29.  On Mac, 
if you have the default 1904 origin, you write it out as 1904-01-01.

3) Other apps write out origin as 1899-12-31

4) We live with the non-correspondence of date serial numbers between 
1900-01-01 and 1900-03-01, as well seem to already be doing with ODF 1.1, 
XLS and XLSX spreadsheets today, without apparent harm.  If we want we 
could make a statement in the specification to the effect that 
calculations involving these dates are implementation-defined. 

I think that would allow Excel and other apps to conform, as well as give 
fair warning that date calculations in January/February 1900 are not 
portable.

I'm looking at this like any other issue where there are calculation 
differences.  For example, 0^0.  It is an error in Excel and Gnumeric. But 
it evaluates to 1 in OpenOffice and Google.  Now, in theory we could add a 
special sheet-level boolean attribute called zero-power-zero, and allow 
each app to declare what assumptions it is making. But realistically, is 
anyone going to go in and add support to modify their calculations based 
on the value of that flag? 

We could go right down the list and for each of dozens of 
implementation-defined behaviors -- distinct boolean type, treatment of 
complex numbers in operations, string to number conversions, numerical 
limits, max string length, etc. -- and define an attribute for each one. 
But is anyone actually going to change their implementation?


-Rob



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]