
Refactor humanize() method, expand usage, fix edge cases

Open systemcatch opened this issue 4 years ago • 11 comments

.humanize() is getting kind of long and could do with refactoring. We'd also like to explore allowing other inputs such as date and timedelta objects.

systemcatch avatar Jan 03 '20 15:01 systemcatch

Hello, I am bored at home, so I decided to take a look at this issue. After examining the code, I think humanize() is flawed.

>>> import arrow
>>> x = arrow.now()
>>> y = x.shift(seconds=46)
>>> x.humanize(y)
'a minute ago'
>>> x.humanize(y, granularity='minute')
'0 minutes ago'

Specifying a granularity should not produce a result that is inconsistent with the default output.

Songyu-Wang avatar Apr 24 '20 01:04 Songyu-Wang

I feel a better approach is to do something like this:

true_granularity = granularity
if granularity == "auto":
  work out the appropriate unit and assign it to true_granularity
do the calculation with true_granularity
handle list granularities
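
The steps above could be sketched roughly as follows. This is a minimal, standalone illustration and not arrow's actual implementation: the names `SECONDS_PER_UNIT`, `resolve_granularity` and `humanize_delta` are hypothetical, only four units are covered, and a negative delta is assumed to mean "in the past".

```python
# Hypothetical unit table, ordered from smallest to largest (not arrow's constants).
SECONDS_PER_UNIT = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
}


def resolve_granularity(delta_seconds, granularity="auto"):
    """Return the unit to use; only "auto" needs to be worked out."""
    if granularity != "auto":
        return granularity
    size = abs(delta_seconds)
    chosen = "second"
    # Keep the largest unit that fits into the delta at least once.
    for unit, secs in SECONDS_PER_UNIT.items():
        if size >= secs:
            chosen = unit
    return chosen


def humanize_delta(delta_seconds, granularity="auto"):
    """Format a delta in seconds; a negative value is treated as the past."""
    if isinstance(granularity, str):
        units = [resolve_granularity(delta_seconds, granularity)]
    else:
        # List granularities: process units from largest to smallest.
        units = sorted(granularity, key=SECONDS_PER_UNIT.__getitem__, reverse=True)

    remaining = abs(int(delta_seconds))
    parts = []
    for unit in units:
        count, remaining = divmod(remaining, SECONDS_PER_UNIT[unit])
        parts.append(f"{count} {unit}" + ("" if count == 1 else "s"))

    phrase = " ".join(parts)
    return f"{phrase} ago" if delta_seconds < 0 else f"in {phrase}"


print(humanize_delta(-90))                      # auto resolves to "minute"
print(humanize_delta(-90, "minute"))            # same code path, same result
print(humanize_delta(3725, ["hour", "minute"]))
```

Because "auto" is resolved to a concrete unit before any arithmetic happens, the automatic and explicit calls go through the same calculation and cannot disagree with each other.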

Songyu-Wang avatar Apr 24 '20 01:04 Songyu-Wang

After we fix #848 we're left with a few edge cases.

Months still present a problem.

(arrow) chris@ThinkPad:~/arrow$ python
Python 3.8.3 (default, Jul  7 2020, 18:57:36) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import arrow
>>> dt=arrow.utcnow()
>>> dt
<Arrow [2020-11-08T15:12:18.911919+00:00]>
>>> later=dt.shift(months=+1)
>>> later
<Arrow [2020-12-08T15:12:18.911919+00:00]>
>>> dt.humanize(later)
'4 weeks ago'
>>> later.humanize(dt)
'in 4 weeks'

Also leap years will need dealing with.

systemcatch avatar Nov 08 '20 16:11 systemcatch

Hi, I can take a look at fixing the edge cases.

Is it supposed to print "1 month ago" and "in 1 month" for the shift(months=+1) example?

@systemcatch

yiransii avatar Dec 02 '20 21:12 yiransii

@yiransii yes that is what we would expect to happen here.

systemcatch avatar Dec 03 '20 10:12 systemcatch

@systemcatch So the edge case happens because a month might have 30 or 31 days, but self._SECS_PER_MONTH is always set to the value of 31 days in the condition elif diff < self._SECS_PER_MONTH:

In the shift(months=+1) example above, dt=arrow.get("2020-11-08T15:12:18.911919+00:00") returns 4 weeks because November has 30 days, which is fewer than 31, so the delta falls under that condition and weeks are used to describe it.
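
To make the reasoning concrete, here is a small standalone sketch. It assumes the fixed 31-day threshold described above, and the helper names `fixed_threshold_unit` and `whole_months_between` are hypothetical; this is not arrow's code and not necessarily what the pull request does. It contrasts a fixed-seconds comparison with a calendar-aware month count for the same pair of dates.

```python
from datetime import datetime, timezone

SECS_PER_DAY = 86400
SECS_PER_WEEK = 7 * SECS_PER_DAY
# Assumption: a fixed 31-day month, as described in the comment above.
SECS_PER_MONTH = 31 * SECS_PER_DAY


def fixed_threshold_unit(start, end):
    """Pick a unit by comparing elapsed seconds against fixed constants."""
    diff = abs((end - start).total_seconds())
    if diff < SECS_PER_WEEK:
        return "days"
    elif diff < SECS_PER_MONTH:
        return "weeks"
    return "months"


def whole_months_between(start, end):
    """Count whole calendar months from start to end (expects start <= end)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # If end has not yet reached start's day and time of month,
    # the final month is incomplete.
    if (end.day, end.time()) < (start.day, start.time()):
        months -= 1
    return months


dt = datetime(2020, 11, 8, 15, 12, 18, tzinfo=timezone.utc)
later = datetime(2020, 12, 8, 15, 12, 18, tzinfo=timezone.utc)

print(fixed_threshold_unit(dt, later))   # 30 days is below the 31-day threshold
print(whole_months_between(dt, later))   # the calendar count sees one full month
```

The calendar-aware count ignores how many seconds the month contains, so it does not depend on month length. It does not address end-of-month clamping (for example 31 January plus one month), which would need separate handling.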

Pull request submitted! I have also added test cases to test humanize with the month+=1 delta for each month. https://github.com/arrow-py/arrow/pull/894

yiransii avatar Dec 10 '20 05:12 yiransii

Once this is fixed, we should revert coverage requirement to 100%: https://github.com/arrow-py/arrow/issues/749.

jadchaar avatar Mar 02 '21 20:03 jadchaar

What level of accuracy are we seeking? I can work on a solution with accuracy similar to https://www.timeanddate.com/date/duration.html; however, this may not be the most efficient solution.

anishnya avatar Apr 25 '21 20:04 anishnya

I'm still seeing this on arrow 1.2.2, for example:

>>> import datetime
>>> import arrow
>>> import dateutil.relativedelta
>>> a_month_ago = datetime.datetime.now() + dateutil.relativedelta.relativedelta(months=-1)
>>> arrow.get(a_month_ago).humanize()
'4 weeks ago'

This should be "a month ago". Are you working on a fix, or is there a different way to do this?

Thanks

olivebay avatar May 06 '22 09:05 olivebay

Hi @olivebay, we're currently evaluating various ways to improve this. The current method for calculating relative dates relies upon fairly basic math operations and assumptions. Our concern with implementing something more accurate and precise would be reduced performance for batch processing jobs. Most likely we'd either introduce a flag or new method for more precise results (similar to https://www.timeanddate.com/date/duration.html).

anishnya avatar May 07 '22 21:05 anishnya

thanks @anishnya

olivebay avatar May 09 '22 08:05 olivebay