polars
polars copied to clipboard
feat(rust): truncate by calendar weeks
When the 'w' duration parameter is used, truncate and round by calendar weeks instead of nano seconds. Breaking change if 'w' was used to truncate by weeks.
Closes #5557
I think this works. Can you fix the python tests and see if those still make sense?
The change affects now only truncate
or round
by weeks. groupby_rolling
still behaves like before, '7d' and '1w' result both in the same outcome.
@cannero how come this doesn't affect the groupby_rolling/groupby_dynamic? They utilize the same duration computations right?
Should the groupby functions also group by calendar weeks? I just have to think if a mix of days and weeks at the period and offset values makes a problem.
Right now for the groupby to work the duration check in groupby.rs had to be changed from:
if offset.nanoseconds() >= period.nanoseconds() {
to:
if offset.duration_ns() >= period.duration_ns() {
I set the PR back to draft as I have to check another groupby_rolling example.
@ritchie46 Right now groupby_rolling/groupby_dynamic uses the Duration::add_ns/us/ms, which works now also with weeks. Did you expect there other changes?
One question about the function to get the nanoseconds of a Duration, because I had to change it at some places.
The nanoseconds
function returns only the nanoseconds without the weeks or months part, duration_ns
includes also the week and month part.
Should we rename the nanoseconds function, so it is not used by accident?
nanoseconds
is still used in rolling_window, where it makes no problem now as only index counts are now allowed.
One question about the function to get the nanoseconds of a Duration, because I had to change it at some places. The
nanoseconds
function returns only the nanoseconds without the weeks or months part,duration_ns
includes also the week and month part. Should we rename the nanoseconds function, so it is not used by accident?nanoseconds
is still used in rolling_window, where it makes no problem now as only index counts are now allowed.
Python has a total_seconds()
method on timedelta
objects to distinguish this case; perhaps we can take similar inspiration? (eg: total_(nano|micro|milli)seconds
)
Should we rename the nanoseconds function, so it is not used by accident? nanoseconds is still used in rolling_window, where it makes no problem now as only index counts are now allowed.
Yes, I think it makes sense to make it clear that you get access the nsecs
attribute instead of the duration in nanoseconds. I think this can be done in a follow up PR.
Thanks for the contribution @cannero.