in_months() returns wrong values in some cases in January to February
- [x] I am on the latest Pendulum version.
- [x] I have searched the issues of this repo and believe that this is not a duplicate.
- Ubuntu 22.04 and Python 3.11:
- 3.0.0:
Issue
We encountered a bug in 2.1 that persists in 3.0 when using the in_months() function of an interval where we get weird results for very specific dates. Examples are shown below.
In a leap year (2024). Why would January 3rd return 1 but the 2rd and 4th return 0?
>>> import pendulum
>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,1,0,0,0, tz='America/Toronto')).in_months()
1
>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,2,0,0,0, tz='America/Toronto')).in_months()
0 # Why is this zero but January 1st and 3rd are 1?
>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,3,0,0,0, tz='America/Toronto')).in_months()
1
>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,4,0,0,0, tz='America/Toronto')).in_months()
0
>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,5,0,0,0, tz='America/Toronto')).in_months()
0
Not a leap year. Why would January 4th return 1 but the 3rd and 5th return 0?
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,1,0,0,0, tz='America/Toronto')).in_months()
1
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,2,0,0,0, tz='America/Toronto')).in_months()
0
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,3,0,0,0, tz='America/Toronto')).in_months()
0
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,4,0,0,0, tz='America/Toronto')).in_months()
1
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,5,0,0,0, tz='America/Toronto')).in_months()
0
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,6,0,0,0, tz='America/Toronto')).in_months()
0
@sdispater @Secrus before we look for potential changes in the code, it should be clear what in_months() is supposed to return. If you want to skip straight to what I think is the best solution, you can jump to the TL;DR at the bottom.
The output is what precise_diff in _helpers returns, which is what in_months() uses. Read Interval in the tables below as end - start.
From the issue above:
| Interval | Current | Desired | Is Correct |
|---|---|---|---|
| Feb 1 - Jan 1 | 1 months 0 days |
1 months 0 days |
Yes |
| Feb 1 - Jan 2 | 0 months 30 days |
0 months 30 days |
Yes |
| Feb 1 - Jan 3 | 0 months 29 days |
0 months 29 days |
Yes |
| Feb 1 - Jan 4 | 1 months 0 days |
0 months 28 days |
No |
| Feb 1 - Jan 5 | 0 months 27 days |
0 months 30 days |
Yes |
This example should clarify the issue (let's assume we're dealing with a non-leap year):
| Interval | Current | Desired |
|---|---|---|
| Feb 28 - Jan 28 | 1 months 0 days |
1 months 0 days |
| Mar 1 - Feb 1 | 1 months 0 days |
1 months 0 days |
So far so good. But what pendulum currently returns for the following examples seems inconsistent.
| Interval | Current | Desired |
|---|---|---|
| Mar 1 - Jan 28 | 1 months 1 days |
1 months 4 days |
| Mar 1 - Jan 29 | 1 months 1 days |
1 months 3 days |
| Mar 1 - Jan 30 | 1 months 1 days |
1 months 2 days |
| Mar 1 - Jan 31 | 1 months 1 days |
1 months 1 days |
The Desired column above represents an alternative. But then we would run into the following:
| Interval | Desired |
|---|---|
| Feb 28 - Jan 28 | 1 months 0 days |
| Mar 1 - Jan 29 | 1 months 3 days |
By increasing the start and end of the interval by 1 day each, the number of days gets bumped by 3.
Clearly, it's not straightforward to establish what the output should be—all solutions appear to yield non-intuitive results.
TL;DR
As Dijkstra advocated, the best solution appears to be to use a semi-open interval [A, B), where A is inclusive and B exclusive.
| Interval | Output |
|---|---|
| Feb 1 - Jan 1 | 0 months 30 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Feb 1 - Jan 2 | 0 months 29 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Feb 1 - Jan 3 | 0 months 28 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Feb 1 - Jan 4 | 0 months 27 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Feb 1 - Jan 5 | 0 months 26 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Feb 28 - Jan 28 | 0 months 30 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Mar 1 - Jan 28 | 1 months 3 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Mar 1 - Jan 29 | 1 months 2 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Mar 1 - Jan 30 | 1 months 1 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Mar 1 - Jan 31 | 1 months 0 days 23 hours 59 minutes 59 seconds 999999 microseconds |
| Mar 1 - Feb 1 | 0 months 27 days 23 hours 59 minutes 59 seconds 999999 microseconds |
There is only ever a month if there is a full month between end and start.
The output is intuitive, consistent and hence also easy to document. I think using a semi-open interval should be the de facto choice for diffs/durations/ranges/intervals.
Related issue: https://github.com/sdispater/pendulum/issues/606