[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]

*Subject*: **YEARFRAC, basis==1: Here's the CORRECT algorithm (previouslyundocumented)**

*From*:**"David A. Wheeler" <dwheeler@dwheeler.com>***To*: office@lists.oasis-open.org*Date*: Tue, 15 Apr 2008 17:24:51 -0400 (EDT)

Below is what appears to be the correct algorithm for determining YEARFRAC when basis=1 (actual/actual). This algorithm gives the same values as Excel 2007 for all 66564 test cases of basis==1 that I have when year != 1900 (Excel is just wrong with 1900, so I've decided to not include any test cases with 1900). The algorithm that appears correct first determines if the two dates "appear to be less than or equal to a year"; if they are, it uses one algorithm, else it uses a different one. The two different sub-algorithms are similar to the two algorithms that Microsoft described. But this should show why the Microsoft XML spec is inadequate; Microsoft (1) failed to document that BOTH were in use, (2) failed to document the algorithm for determining which one to use when, and (3) incorrectly described at least one of the algorithm fragments (at least, if two dates are in the SAME year, then you ALWAYS use that year's leap-year-length even if date1-date2 doesn't cross Feb. 29; that contradicts their description). I'd love to hear if anyone knows of problems with this algorithm. This appears to pass a large number of "hairy test cases" to find all the corners, though, and that's encouraging. I present the algorithm as Python code, below. What I did is first create a spreadsheet that could recalculate, precisedly, a small set of related test cases until I could determine what APPEARED to be the right algorithm. I then implemented the algorithm, and modified it until the code gave EXACTLY the same answers for the ENTIRE large test suite (which was intentionally filled with nasty corner cases as well as "easy" cases). The code is currently ugly and unnecessarily slow, because this is the result of exploratory programming to try to find the definition at ALL. But since I finally got a result, I wanted to share it immediately. Cleanup needs to happen, but let's see if it's worth cleaning first. My expectation is that we would embed pseudocode for all the basis values in the OpenFormula specification itself, after a little cleaning-up. I chose Python because it's a clean-looking language whose external form looks a lot like pseudocode to start with. With twiddling, we could probably even embed executable Python code while still claiming that it's "just pseudocode" (that way we don't have to appeal to a Python spec). I have posted the detailed test data, code, and related material here: http://www.dwheeler.com/yearfrac/ The test suite has 268336 test cases for YEARFRAC with various basis values. Warning: It takes a little time to run through all that!! I have _not_ examined any Microsoft source or binary code to do this. I am solely looking at the inputs and outputs of their implementation, to correct the incorrect descriptions that Microsoft has publicly provided. Do we have any volunteers for another basis value? If someone wants to execute the same process for another basis value, that'd be great. I'm hoping that by "showing through example" of how to do this, others will take up the charge. My approach was first, pick a few test cases and create a spreadsheet that could _precisely_ reproduce the "correct" values. Then, I'd determine the general rule, implement it in Python, and run it against the much larger set of test data. If that failed, I'd examine the failures to see how the rule needed to be changed. --- David A. Wheeler ================================================ Code to implement basis=1, in Python (so indentation is important): def from_to(start, end): return range(start, end+1) def diffdays(ldate1, ldate2): # Report the TRUE number of days different between ldate1 and ldate2; # 0 if ldate1==ldate2, 1 if ldate1+1 == ldated2. Presumes ldate1 <= ldate2. return (ldate2 - ldate1).days * 1.0 def days_in_year(year): return diffdays(date(year, 1, 1), date(year, 12, 31)) + 1 def is_leap_year(year): return (days_in_year(year) == 366) def feb29_between(date1, date2): # Requires date2.year = (date1.year + 1) or date2.year = date1.year. # Returns True if "Feb 29" is between the two dates (date1 may be Feb29). # Two possibilities: date1.year is a leap year, and date1 <= Feb 29 y1, # or date2.year is a leap year, and date2 > Feb 29 y2. mar1_date1_year = date(date1.year, 3, 1) if ( is_leap_year(date1.year) and (date1 < mar1_date1_year) and (date2 >= mar1_date1_year)): return True mar1_date2_year = date(date2.year, 3, 1) if ( is_leap_year(date2.year) and (date2 >= mar1_date2_year) and (date1 < mar1_date2_year)): return True return False def diff_appearance_lt_year(date1, date2): # Report if date1 and date2 "appear" to be 1 year or less apart. # Requires date1 <= date2; returns boolean. if date1.year == date2.year: return True if ( ((date1.year + 1) == date2.year) and ((date1.month > date2.month) or ((date1.month == date2.month) and (date1.day >= date2.day)))): return True return False def basis1(date1,date2): # Swap so date1 <= date2 in all cases: if date1 > date2: date1, date2 = date2, date1 if date1 == date2: return 0.0 if diff_appearance_lt_year(date1, date2): if feb29_between(date1, date2): year_length = 366. elif (date1.year == date2.year and is_leap_year(date1.year)): year_length = 366. else: year_length = 365. return diffdays(date1, date2) / year_length else: days_in_years = 0.0 num_years = 0.0 for year in from_to(date1.year, date2.year): days_in_years += days_in_year(year) num_years += 1 average_year_length = days_in_years / num_years return diffdays(date1, date2) / average_year_length

**References**:**YEARFRAC, etc.***From:*robert_weir@us.ibm.com

**Re: [office] YEARFRAC, etc.***From:*"David A. Wheeler" <dwheeler@dwheeler.com>

**Re: [office] YEARFRAC, etc.***From:*Patrick Durusau <patrick@durusau.net>

**Re: [office] YEARFRAC, etc.***From:*"Andreas J. Guelzow" <aguelzow@math.concordia.ab.ca>

**Re: [office] YEARFRAC, etc.***From:*Patrick Durusau <patrick@durusau.net>

**Re: [office] YEARFRAC, etc.***From:*"Andreas J. Guelzow" <aguelzow@math.concordia.ab.ca>

**Re: [office] YEARFRAC, etc.***From:*Patrick Durusau <patrick@durusau.net>

**Re: [office] YEARFRAC, etc.***From:*robert_weir@us.ibm.com

**Re: [office] YEARFRAC, etc.***From:*Patrick Durusau <patrick@durusau.net>

**Re: [office] YEARFRAC, etc.***From:*robert_weir@us.ibm.com

**Re: [office] YEARFRAC, etc.***From:*Patrick Durusau <patrick@durusau.net>

**Re: [office] YEARFRAC, etc.***From:*robert_weir@us.ibm.com

**Re: [office] YEARFRAC, etc.***From:*Patrick Durusau <patrick@durusau.net>

**Re: [office] YEARFRAC, etc.***From:*robert_weir@us.ibm.com

**Re: [office] YEARFRAC, etc.***From:*Patrick Durusau <patrick@durusau.net>

**Re: [office] YEARFRAC, etc.***From:*Andreas J Guelzow <aguelzow@math.concordia.ab.ca>

**Re: [office] YEARFRAC, etc.***From:*Patrick Durusau <patrick@durusau.net>

**Re: [office] YEARFRAC, etc.***From:*Andreas J Guelzow <aguelzow@math.concordia.ab.ca>

**YEARFRAC: Why both Microsoft XML definitions are wrong, and nextsteps***From:*"David A. Wheeler" <dwheeler@dwheeler.com>

**Re: [office] YEARFRAC: Why both Microsoft XML definitions are wrong,and next steps***From:*Patrick Durusau <patrick@durusau.net>

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]