arrow Refactor humanize() method, expand usage, fix edge cases

Open systemcatch opened this issue 4 years ago • 11 comments

.humanize() is getting kind of long and could do with refactoring. We'd also like to explore allowing other inputs such as date and timedelta objects.

Jan 03 '20 15:01 systemcatch

Hello, I am bored at home, so I decided to take a look at this issue. After examining the code, I feel that humanize() is flawed.

import arrow

x = arrow.now()
y = x.shift(seconds=46)

x.humanize(y)

'a minute ago'

x.humanize(y, granularity='minute')

'0 minutes ago'

Specifying a granularity should not cause an inconsistent result.

Apr 24 '20 01:04 Songyu-Wang

I feel a better approach is to do something like this:

true_granularity = granularity
if granularity == "auto":
  figure out the best unit and update true_granularity
do the calculation with true_granularity
handle list granularities separately
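The outline above could be sketched roughly as follows. This is only an illustration of the idea, not arrow's implementation: the names `_UNITS`, `resolve_granularity`, and `humanize_delta` are hypothetical, the unit sizes are simplified fixed values, and list granularities are left out.

```python
from datetime import timedelta

# Hypothetical, simplified unit sizes in seconds (not arrow's constants).
_UNITS = [
    ("year", 365 * 86400),
    ("month", 30 * 86400),
    ("week", 7 * 86400),
    ("day", 86400),
    ("hour", 3600),
    ("minute", 60),
    ("second", 1),
]


def resolve_granularity(seconds: float) -> str:
    """Pick the largest unit that fits at least once into the delta."""
    for name, size in _UNITS:
        if abs(seconds) >= size:
            return name
    return "second"


def humanize_delta(delta: timedelta, granularity: str = "auto") -> str:
    """Format a delta; 'auto' is resolved first, then one shared code path runs."""
    seconds = delta.total_seconds()
    true_granularity = (
        resolve_granularity(seconds) if granularity == "auto" else granularity
    )
    size = dict(_UNITS)[true_granularity]
    count = int(abs(seconds) // size)  # truncate toward zero for every unit
    plural = "" if count == 1 else "s"
    phrase = f"{count} {true_granularity}{plural}"
    return f"in {phrase}" if seconds > 0 else f"{phrase} ago"
```

Because both the automatic and the explicit case go through the same truncating calculation, a 46-second delta would read as "46 seconds ago" by default and "0 minutes ago" with granularity="minute", so the two answers no longer contradict each other.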

Apr 24 '20 01:04 Songyu-Wang

After we fix #848 we're left with a few edge cases.

Months still present a problem.

(arrow) chris@ThinkPad:~/arrow$ python
Python 3.8.3 (default, Jul  7 2020, 18:57:36) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import arrow
>>> dt=arrow.utcnow()
>>> dt
<Arrow [2020-11-08T15:12:18.911919+00:00]>
>>> later=dt.shift(months=+1)
>>> later
<Arrow [2020-12-08T15:12:18.911919+00:00]>
>>> dt.humanize(later)
'4 weeks ago'
>>> later.humanize(dt)
'in 4 weeks'

Also leap years will need dealing with.

Nov 08 '20 16:11 systemcatch

Hi, I can take a look at fixing the edge cases.

Is it supposed to print "1 month ago" and "in 1 month" for the month+=1 example?

@systemcatch

Dec 02 '20 21:12 yiransii

@yiransii yes that is what we would expect to happen here.

Dec 03 '20 10:12 systemcatch

@systemcatch So the edge case happens because a month might have 30 or 31 days, but self._SECS_PER_MONTH is always set to the value of 31 days in the condition elif diff < self._SECS_PER_MONTH:

In the above month+=1 example, dt=arrow.get("2020-11-08T15:12:18.911919+00:00") returns "4 weeks" because November has 30 days, which is fewer than 31, so the condition is true and weeks are used to describe the delta.
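The arithmetic can be checked with the standard library alone. The sketch below assumes the fixed 31-day threshold described above; `ASSUMED_SECS_PER_MONTH` is a stand-in name for illustration and may not match the constant's actual value in arrow.

```python
from datetime import datetime, timezone

SECS_PER_DAY = 86400
# Assumption taken from the explanation above: a fixed 31-day month threshold.
ASSUMED_SECS_PER_MONTH = 31 * SECS_PER_DAY

dt = datetime(2020, 11, 8, 15, 12, 18, tzinfo=timezone.utc)
later = datetime(2020, 12, 8, 15, 12, 18, tzinfo=timezone.utc)

# November has 30 days, so one calendar month is 30 * 86400 seconds here.
diff = (later - dt).total_seconds()

# The delta is below the fixed threshold, so the "weeks" branch is taken.
falls_in_weeks_branch = diff < ASSUMED_SECS_PER_MONTH
weeks = int(diff // (7 * SECS_PER_DAY))

print(diff, falls_in_weeks_branch, weeks)
```

Any fixed threshold longer than 30 days would behave the same way for this pair of dates, which is why a one-month shift from a 30-day month comes out as weeks.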

Pull request submitted! I have also added test cases to test humanize with the month+=1 delta for each month. https://github.com/arrow-py/arrow/pull/894

Dec 10 '20 05:12 yiransii

Once this is fixed, we should revert coverage requirement to 100%: https://github.com/arrow-py/arrow/issues/749.

Mar 02 '21 20:03 jadchaar

What level of accuracy are we seeking? I can work on a solution with accuracy similar to https://www.timeanddate.com/date/duration.html; however, this may not necessarily be the most efficient solution.

Apr 25 '21 20:04 anishnya

I'm still seeing this on arrow 1.2.2, for example:

>>> import datetime
>>> import dateutil.relativedelta
>>> import arrow
>>> a_month_ago = datetime.datetime.now() + dateutil.relativedelta.relativedelta(months=-1)
>>> arrow.get(a_month_ago).humanize()
'4 weeks ago'

This should be "a month ago". Are you working on a fix, or is there a different way to do this?

Thanks

May 06 '22 09:05 olivebay

Hi @olivebay, we're currently evaluating various ways to improve this. The current method for calculating relative dates relies upon fairly basic math operations and assumptions. Our concern with implementing something more accurate and precise would be reduced performance for batch processing jobs. Most likely we'd either introduce a flag or new method for more precise results (similar to https://www.timeanddate.com/date/duration.html).
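For illustration, a calendar-aware month count might look something like the sketch below. It is not arrow code and not a proposed API: `add_months` and `whole_months_between` are hypothetical helpers built only on the standard library, and they trade extra per-call work for calendar correctness.

```python
import calendar
from datetime import datetime


def add_months(dt: datetime, months: int) -> datetime:
    """Shift dt by whole calendar months, clamping the day to the month's length."""
    index = dt.year * 12 + (dt.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return dt.replace(year=year, month=month0 + 1, day=min(dt.day, last_day))


def whole_months_between(start: datetime, end: datetime) -> int:
    """Count whole calendar months from start to end (negative if end < start)."""
    if end < start:
        return -whole_months_between(end, start)
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # Step back one month if the shifted start would overshoot the end.
    if add_months(start, months) > end:
        months -= 1
    return months
```

With these helpers, 2020-11-08 to 2020-12-08 counts as one whole month regardless of November's length, and 2020-02-29 to 2021-02-28 counts as twelve because the leap day is clamped to the end of February.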

May 07 '22 21:05 anishnya

thanks @anishnya

May 09 '22 08:05 olivebay
