arrow
Refactor humanize() method, expand usage, fix edge cases
.humanize() is getting kind of long and could do with refactoring. We'd also like to explore allowing other inputs such as date and timedelta objects.
Hello, I am bored at home so I decided to take a look at this issue. After examining the code, I feel humanize() is flawed.
>>> import arrow
>>> x = arrow.now()
>>> y = x.shift(seconds=46)
>>> x.humanize(y)
'a minute ago'
>>> x.humanize(y, granularity='minute')
'0 minutes ago'
Specifying a granularity should not produce an inconsistent result.
I feel a better approach is to do something like this:
true_granularity = granularity
if granularity == "auto":
    figure out and update true_granularity
do the calculation with true_granularity
handle list granularities
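The steps above could look roughly like the following. This is only a minimal standalone sketch, not arrow's code: SECONDS_PER_UNIT, resolve_granularity and humanize_delta are hypothetical names, the thresholds are simplified, and list granularities are left out. The point is that "auto" only picks the unit, and a single calculation path then does the rounding for every granularity.

```python
# Hypothetical names and simplified thresholds; not arrow's implementation.
SECONDS_PER_UNIT = {
    "second": 1,
    "minute": 60,
    "hour": 60 * 60,
    "day": 60 * 60 * 24,
    "week": 60 * 60 * 24 * 7,
}


def resolve_granularity(delta_seconds, granularity="auto"):
    """Return the caller's unit, or for "auto" the largest unit that fits."""
    if granularity != "auto":
        return granularity
    magnitude = abs(delta_seconds)
    chosen = "second"
    # The dict is ordered from smallest to largest unit.
    for unit, size in SECONDS_PER_UNIT.items():
        if magnitude >= size:
            chosen = unit
    return chosen


def humanize_delta(delta_seconds, granularity="auto"):
    """Describe a signed delta in seconds (negative means the past)."""
    unit = resolve_granularity(delta_seconds, granularity)
    # One shared rounding rule, whichever way the unit was chosen.
    count = round(abs(delta_seconds) / SECONDS_PER_UNIT[unit])
    noun = unit if count == 1 else unit + "s"
    if delta_seconds < 0:
        return f"{count} {noun} ago"
    return f"in {count} {noun}"


print(humanize_delta(-46))            # 46 seconds ago
print(humanize_delta(-46, "minute"))  # 1 minute ago
```

With this shape, a 46 second delta never reports "0 minutes" when the caller asks for minutes, because the explicit and automatic paths share the same rounding.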
After we fix #848 we're left with a few edge cases.
Months still present a problem.
(arrow) chris@ThinkPad:~/arrow$ python
Python 3.8.3 (default, Jul 7 2020, 18:57:36)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import arrow
>>> dt=arrow.utcnow()
>>> dt
<Arrow [2020-11-08T15:12:18.911919+00:00]>
>>> later=dt.shift(months=+1)
>>> later
<Arrow [2020-12-08T15:12:18.911919+00:00]>
>>> dt.humanize(later)
'4 weeks ago'
>>> later.humanize(dt)
'in 4 weeks'
Also leap years will need dealing with.
Hi, I can take a look at fixing the edge cases.
Is it supposed to print "1 month ago" and "in 1 month" for the month+=1 example? @systemcatch
@yiransii yes that is what we would expect to happen here.
@systemcatch The edge case happens because a month can have 30 or 31 days, but self._SECS_PER_MONTH is always set to the value of 31 days in the condition elif diff < self._SECS_PER_MONTH:
In the month+=1 example above, dt=arrow.get("2020-11-08T15:12:18.911919+00:00") returns 4 weeks because November has 30 days, which is fewer than 31, so the delta falls into that branch and is described in weeks.
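To make the arithmetic concrete, here is a small stdlib-only sketch. It assumes the fixed 31-day threshold described in the comment above (FIXED_SECS_PER_MONTH is a hypothetical stand-in, not arrow's constant) and compares it with a calendar-aware month count. whole_months_between is a hypothetical helper, not part of arrow, and it does not clamp end-of-month dates such as Jan 31 to Feb 28.

```python
from datetime import datetime, timezone

SECS_PER_DAY = 60 * 60 * 24
# Assumption: a fixed month threshold of 31 days, as described in the comment above.
FIXED_SECS_PER_MONTH = 31 * SECS_PER_DAY


def whole_months_between(start, end):
    """Count complete calendar months from start to end (start <= end)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # Drop one month if end has not yet reached start's day and time of month.
    if (end.day, end.time()) < (start.day, start.time()):
        months -= 1
    return months


start = datetime(2020, 11, 8, 15, 12, 18, tzinfo=timezone.utc)
end = datetime(2020, 12, 8, 15, 12, 18, tzinfo=timezone.utc)

delta_seconds = (end - start).total_seconds()
# November has 30 days, so the delta is below the fixed 31-day threshold.
fixed_says_month = delta_seconds >= FIXED_SECS_PER_MONTH
calendar_months = whole_months_between(start, end)

print(delta_seconds / SECS_PER_DAY)  # 30.0
print(fixed_says_month)              # False
print(calendar_months)               # 1
```

A calendar-aware count reports one month here, while any fixed threshold longer than 30 days sends the same delta down the weeks branch.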
Pull request submitted! I have also added test cases to test humanize with the month+=1 delta for each month.
https://github.com/arrow-py/arrow/pull/894
Once this is fixed, we should revert the coverage requirement to 100%: https://github.com/arrow-py/arrow/issues/749.
What level of accuracy are we seeking? I can work on a solution with similar accuracy to https://www.timeanddate.com/date/duration.html; however, this may not necessarily be the most efficient solution.
I'm still seeing this on arrow 1.2.2, for example:
>>> import datetime
>>> import arrow
>>> import dateutil.relativedelta
>>> a_month_ago = datetime.datetime.now() + dateutil.relativedelta.relativedelta(months=-1)
>>> arrow.get(a_month_ago).humanize()
'4 weeks ago'
This should be "a month ago". Are you working on a fix, or is there a different way to do this?
Thanks
Hi @olivebay, we're currently evaluating various ways to improve this. The current method for calculating relative dates relies upon fairly basic math operations and assumptions. Our concern with implementing something more accurate and precise would be reduced performance for batch processing jobs. Most likely we'd either introduce a flag or new method for more precise results (similar to https://www.timeanddate.com/date/duration.html).
thanks @anishnya