tidypolars

Calculating time

Open exsell-jc opened this issue 2 years ago • 5 comments

In R with lubridate, it would look like this:

one_year_before = some_date - years(1)
one_year_before = some_date - months(12)

But in the tidypolars function list, there doesn't seem to be a years or months function: https://tidypolars.readthedocs.io/en/latest/reference.html

exsell-jc • Oct 10 '22 08:10

Hmm I'll have to look into this more. At first glance there isn't a "simple translation". There might have to be a tidypolars helper for this.

markfairbanks • Oct 10 '22 17:10

For now I'd recommend using col().dt.offset_by() straight from polars.

FYI this method is built into the col() expression, so you don't need to import polars for this to work.

import tidypolars as tp
from tidypolars import col

df = tp.Tibble(date = ['2021-01-01', '2021-10-01']).mutate(date = col('date').str.strptime(tp.Date))

(
    df
    .mutate(minus_two_months = col('date').dt.offset_by("-2mo"),
            add_year = col('date').dt.offset_by("1y"))
)
┌────────────┬──────────────────┬────────────┐
│ date       ┆ minus_two_months ┆ add_year   │
│ ---        ┆ ---              ┆ ---        │
│ date       ┆ date             ┆ date       │
╞════════════╪══════════════════╪════════════╡
│ 2021-01-01 ┆ 2020-11-01       ┆ 2022-01-01 │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2021-10-01 ┆ 2021-08-01       ┆ 2022-10-01 │
└────────────┴──────────────────┴────────────┘

markfairbanks • Oct 10 '22 19:10

Workaround (not exactly a solution): use polars in between. Not as elegant, but it works.

I'm having a problem with filtering/date comparison. I just added the filter() pipe to your code.

monthss = 6
(
    df
    .mutate(minus_two_months = col('date').dt.offset_by("-2mo"),
            add_year = col('date').dt.offset_by("1y"))
    .filter(col('date') > max(col('date')).dt.offset_by(f'-{monthss}mo'))
)

Here is R's equivalent:

monthss = 6
df |>
  mutate(minus_two_months = date - months(2),
         add_year = date + years(1)) |>
  filter(date > max(date) - months(monthss))

exsell-jc • Dec 01 '22 12:12

(Link is dead) Reporting another related issue: calculating date - date = x days (as a duration, so it would be ddays(1)), e.g. 2000-01-01 - 1999-12-31 = 1 day. Compensating for leap years and other anomalies would also be nice.
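
For reference, the requested semantics can be illustrated with Python's standard-library datetime module. This is only a sketch of the expected behaviour, not tidypolars or polars API, and the variable names are made up for the example.

```python
from datetime import date

# Subtracting two dates yields a timedelta, i.e. an exact duration.
one_day = date(2000, 1, 1) - date(1999, 12, 31)

# The calendar is respected: a span that crosses 29 Feb 2000 has 366 days,
# while the following (non-leap) year has 365.
leap_span = date(2001, 1, 1) - date(2000, 1, 1)
normal_span = date(2002, 1, 1) - date(2001, 1, 1)

print(one_day.days, leap_span.days, normal_span.days)  # prints "1 366 365"
```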

PathosEthosLogos • Apr 30 '23 15:04