OfficeDocs-Support
OfficeDocs-Support copied to clipboard
Incorrect Reasoning about Excel Leap Day Error
Your claim that "If the behavior remains uncorrected, only one problem occurs" (namely the WEEKDAY problem for dates before the non-existent year-1900 leap day) is incorrect. There is at least one more problem: The calculation of the number of days in any interval that includes the non-existent leap day yields a result that is one day too large. For example, if in Excel you subtract February 28th, 1900 from March 1st, 1900 then you get 2, but the correct answer is 1.
Your claim that "If this behavior were to be corrected, many problems would arise" is incorrect. It is based on the assumption that the correction must be to decrease by one the internal day number of all dates after the non-existent leap day and leaving the relationship unchanged for earlier dates, when the correction can just as well be to increase by one the internal day number of all dates before the non-existent leap day and leaving the relationship unchanged for later dates.
You state that "most users do not use dates before March 1, 1900" as an argument why the "only one problem" with the current behavior (the WEEKDAY problem for earlier dates) is rare. With that same argument, fixing the leap day problem by increasing the internal day number of all dates before that non-existent leap day would rarely cause problems in practice, either.
If you do not fix this leap day problem, then WEEKDAYs before the non-existent leap day remain wrong, and date differences spanning that leap day remain wrong. If you fix the problem, then you can choose for which range of dates everything stays the same: either for all dates before the non-existent leap day, or for all dates after the non-existent leap day, but not for both. And then the WEEKDAY and date difference calculations will be correct for all dates (or at least won't be wrong due to this non-existent leap day anymore).
Regards, Louis Strous
Document Details
⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.
- ID: e28dce76-cc6a-3af7-b51b-48ea6ba8b3ef
- Version Independent ID: c691ecbb-6184-a73b-2f25-2d72eccd8a14
- Content: Excel incorrectly assumes that the year 1900 is a leap year - Office
- Content Source: Office/Client/excel/wrongly-assumes-1900-is-leap-year.md
- Product: office-perpetual-itpro
- GitHub Login: @Lucciz
- Microsoft Alias: v-zolu
@LouisStrous thank you for reaching out and helping improve MS Docs. I'll look into this.
@LouisStrous thank you for your valuable input. With regards to the discrepancy you found in the calculation for dates, is there a possible workaround you can suggest? I'm not too sure there is one as this article is explaining why it is a tricky one...Please have another look and let me know what you think. I see the article has also been updated recently.
@Reeced40 I am not sure exactly what you're asking, so my apologies if my response is not helpful.
The simplest solution to the documentation problem that I reported is to fix the documentation.
If you want to improve the date calculations so the WEEKDAY function always returns the correct result and calculations of the number of days between two dates spanning the non-existent year-1900 leap day, then the least painful solution is to increase by 1 the running day number (as returned by the DATEVALUE function or as displayed when entering a date value in an Excel cell and then changing the format to "General") of dates before March 1st, 1900, so that February 28th, 1900 yields running day number 60 instead of the 59 that is currently associated with it. Then calculations of the number of days between two dates on either side of non-existent "February 29th, 1900" (by subtracting two dates) yield the correct results, and WEEKDAY (which I assume is based on the remainder of dividing the running day number by 7) yields the correct result for every valid date that Excel supports, and there is "serial date compatibility" with older versions of Excel for any dates from March 1st, 1900.
The simplest general formula for calculating a running day number from the year number, month number, and day-of-the-month number in the Gregorian calendar correctly omits a leap day from the year 1900, without needing any special handling of just that year. It follows that the formula that Excel currently uses must not be the simplest, so fixing the problem might even make the calculations that Excel does under the hood simpler. (It is a trivial matter to adjust the formula so it matches the current DATEVALUE results for dates from March 1, 1900.)
I hadn't realized before that Excel does not support any dates before the year 1900. This means that fixing the problem as I suggest only affects 60 days -- those from January 0th, 1900 (which Excel accepts) through February 28th, 1900. There is no technical reason anymore why Excel couldn't support dates before the year 1900. The formula I mentioned before works perfectly fine arbitrarily far into the past and future.
Regarding "Correcting this behavior would break serial date compatibility between Microsoft Excel and other programs that use dates": That statement is only correct for "other programs" that exhibit the exact same bug as Excel, so it is not correct in general. It might be correct to state "... and other Microsoft programs that use dates".
Hmm, the "serial date compatibility" issue is a tricky one. Some users of Excel or of other programs that ingest data produced by Excel may have written special code to work around the flaw in Excel's running day number. If you were to adjust Excel as I proposed before, then those users would have to adjust their special code accordingly, which may be a big problem. Maybe that original source code is no longer (easily) available. Maybe everyone who knows how to adjust that code -- or that there is such special code in their programs -- has retired by now. Or maybe there are no such users (anymore).
On the other hand, those users would have had to notice the flaw in the first place, and then write the special code to work around it. That is a grave annoyance they could have done without. And the same goes for future users of Excel for which this flaw turns out to be relevant.
If you want to avoid any headache for users of programs that by now rely on the flaw being present, and yet make life easier for users that would prefer to avoid having to work around the flaw in new programs, then it seems to me your only option is to introduce new Excel functions that work with running day numbers without the flaw. You could tie them to some well-established running day number scheme, such as the Julian Day (JD), or the Modified Julian Day (MJD). There could be a new function DATETOMJD that produces a Modified Julian Day number given a date or given an old-style Microsoft running day number (as produced by the DATEVALUE function), and MJDTODATE for the reverse. And there could be an MJDTOWEEKDAY to calculate the weekday from the MJD.
Those new functions DATETOMJD and MJDTODATE could support a wider range of dates than the current Excel date functions do -- in particular, they could support dates well before the year 1900, which are of great interest at least to astronomers and historians. You'll then have to decide how to handle the historical switch from the Julian to the Gregorian calendar, which was adopted at different times by different countries. You could ignore the switch and use the (proleptic) Gregorian calendar for all dates, or introduce options to indicate the desired calendar in the new functions.
@LouisStrous thank you for your valuable input. I'll send this to the team for possible future revisions of the article.