
Q: How to get timedelta from a relative time?

Open rmax opened this issue 7 years ago • 4 comments

Hi!

I couldn't find a way to get a timedelta, rather than a datetime, from a string like "3 hours ago".

The use case is: I have a column "when" with values like "3 hours ago" and a column "timestamp" with a datetime value, so I want to do something like timestamp - dateparser.parse_timedelta("3 hours ago").

rmax avatar Jul 22 '16 21:07 rmax

It sounds like passing timestamp as a RELATIVE_BASE value can do the trick.

kmike avatar Sep 07 '16 23:09 kmike

@rmax Since you can use something as simple as timestamp - dateparser.parse_timedelta("3 hours ago"), as you suggest, does it make sense to change the dateparser API for this?

Gallaecio avatar Apr 01 '19 18:04 Gallaecio

I needed something similar, and I think it may not be a bad idea to add a method that returns a timedelta instead of a date object.

Because dateparser needs some time to parse a date, computing datetime.now()-parse('yesterday') loses precision. I just opened a draft PR (https://github.com/scrapinghub/dateparser/pull/623) to illustrate this.

Examples (using the code in the PR):

In: from datetime import datetime

In: from dateparser import parse, parse_timedelta


In: str(datetime.now()-parse('13 min ago'))                                                                                                                                                                
Out: '0:12:59.993999'

In: str(parse_timedelta('13 min ago'))                                                                                                                                                                     
Out: '0:13:00.000139'

In: str(datetime.now()-parse('yesterday'))                                                                                                                                                                 
Out: '23:59:59.997510'

In: str(parse_timedelta('yesterday'))                                                                                                                                                                      
Out: '1 day, 0:00:00.000182'

In: str(datetime.now() - parse('1時間13分')) 
Out: '1:12:59.997616'

In: str(parse_timedelta('1時間13分'))
Out: '1:13:00.000305'
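The drift in the subtraction-based outputs comes from reading the clock twice, once by the caller and once inside the parser. The sketch below reproduces that effect with the standard library only; parse_relative is a hypothetical stand-in, not dateparser's API or the PR's code.

```python
from datetime import datetime, timedelta

def parse_relative(offset, now=None):
    """Hypothetical stand-in for a relative-date parser: returns now - offset."""
    if now is None:
        now = datetime.now()  # the parser reads the clock itself
    return now - offset

offset = timedelta(minutes=13)

# Two clock reads (the caller's, then the parser's): the gap between them
# leaks into the result, so it usually comes out slightly under 13 minutes.
drifting = datetime.now() - parse_relative(offset)

# One clock read shared by both sides: the difference is exactly the offset.
base = datetime.now()
exact = base - parse_relative(offset, now=base)
```

Sharing a single base between the parse and the subtraction removes the drift without any correction step.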

Of course we could add some corrections (on the order of microseconds) and warn about that in the docs, but the point here is that this library aims to help with handling dates, and this could be really useful for some people.

What do you think?

noviluni avatar Feb 28 '20 14:02 noviluni

This should do:

from datetime import datetime, timedelta

import dateparser
from dateutil.relativedelta import relativedelta

# Any fixed date will do, as long as base + offset stays within datetime's range.
_base = datetime(1, 2, 1)

def _parse_relative_time(text: str) -> datetime | None:
    # Only the relative-time parser is enabled, resolved against the fixed base.
    return dateparser.parse(text, settings={"RELATIVE_BASE": _base, "PARSERS": ["relative-time"]})

def get_timedelta(text: str) -> timedelta | None:
    if parsed_date := _parse_relative_time(text):
        return parsed_date - _base
    return None  # text was not a relative time expression

def get_relativedelta(text: str) -> relativedelta | None:
    if parsed_date := _parse_relative_time(text):
        return relativedelta(parsed_date, _base)
    return None  # text was not a relative time expression

>>> get_timedelta("in 3 days")
datetime.timedelta(days=3)
>>> get_relativedelta("in 3 month")
relativedelta(months=+3)

However, it is prone to errors, such as:

>>> get_relativedelta("in 90 days")
relativedelta(months=+3)

When in fact it should return relativedelta(days=+90), because 90 days is not the same as 3 months.
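To see why the two are not interchangeable, here is a small standard-library check. The base dates are arbitrary choices for illustration, and the "three months later" dates are written out by hand rather than computed.

```python
from datetime import date, timedelta

# Arbitrary non-leap-year base dates, chosen only for illustration.
feb_base = date(2001, 2, 1)
jul_base = date(2001, 7, 1)

# Ninety days after 1 February overshoots 1 May by one day.
after_90_days = feb_base + timedelta(days=90)

# Three calendar months span a different number of days depending on the start.
days_feb_to_may = (date(2001, 5, 1) - feb_base).days
days_jul_to_oct = (date(2001, 10, 1) - jul_base).days

print(after_90_days)    # 2001-05-02
print(days_feb_to_may)  # 89
print(days_jul_to_oct)  # 92
```

Once a relative expression has been resolved to a date and converted back, the information about whether the text said days or months is gone.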

So a function like parse_timedelta that returns an actual relativedelta from text would be really welcome.

Bobronium avatar May 23 '22 14:05 Bobronium