
dt.truncate does not work correctly for weeks

Open mjkanji opened this issue 1 year ago • 3 comments

Polars version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of Polars.

Issue description

The dt.truncate function incorrectly pegs the start of the week to Thursday instead of Monday. This may be related to #5507, but that issue focuses on timezones, whereas here I have a standard date.
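One plausible explanation for the Thursday boundary, offered as an assumption rather than anything confirmed from the Polars source: the Unix epoch, 1970-01-01, fell on a Thursday, so a truncation that floors to multiples of 7 days counted from the epoch will always land on a Thursday. The sketch below uses only the standard library; `truncate_week_epoch_aligned` is a hypothetical helper, not a Polars function.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)  # the Unix epoch; weekday() == 3, i.e. a Thursday


def truncate_week_epoch_aligned(d: date) -> date:
    """Hypothetical helper: floor d to a multiple of 7 days since the epoch."""
    days = (d - EPOCH).days
    return EPOCH + timedelta(days=days - days % 7)


# Same date range as the reproducible example in this issue.
for offset in range(7):
    d = date(2022, 11, 14) + timedelta(days=offset)
    print(d, d.strftime("%A"), truncate_week_epoch_aligned(d))
```

Under this assumption, 2022-11-14 through 2022-11-16 map to 2022-11-10 and 2022-11-17 through 2022-11-20 map to 2022-11-17, which is the same split the truncated column shows in the output below.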

Reproducible example

import polars as pl
from datetime import date, timedelta

start = date(2022, 11, 14)
end = date(2022, 11, 20)
print(
    pl.date_range(start, end, timedelta(days=1), name="dates")
    .to_frame()
    .select([
        pl.col("dates"),
        pl.col("dates").dt.strftime("%A").alias("day_of_week"),
        pl.col("dates").dt.truncate("1w").alias("truncated"),
        pl.col("dates").dt.week().alias("week")
    ])
)

shape: (7, 4)
┌────────────┬─────────────┬────────────┬──────┐
│ dates      ┆ day_of_week ┆ truncated  ┆ week │
│ ---        ┆ ---         ┆ ---        ┆ ---  │
│ date       ┆ str         ┆ date       ┆ u32  │
╞════════════╪═════════════╪════════════╪══════╡
│ 2022-11-14 ┆ Monday      ┆ 2022-11-10 ┆ 46   │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2022-11-15 ┆ Tuesday     ┆ 2022-11-10 ┆ 46   │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2022-11-16 ┆ Wednesday   ┆ 2022-11-10 ┆ 46   │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2022-11-17 ┆ Thursday    ┆ 2022-11-17 ┆ 46   │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2022-11-18 ┆ Friday      ┆ 2022-11-17 ┆ 46   │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2022-11-19 ┆ Saturday    ┆ 2022-11-17 ┆ 46   │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2022-11-20 ┆ Sunday      ┆ 2022-11-17 ┆ 46   │
└────────────┴─────────────┴────────────┴──────┘

Expected behavior

Expected behaviour is for truncating by a week to peg the start of the week to Monday (i.e., all the days with the same value for the week column above should have the same truncated value). In the example above, the week starts on the 14th and ends on the 20th, so the whole truncated column should have only one unique value.
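To make the expectation concrete, here is a minimal standard-library sketch of Monday-based week truncation. It does not use Polars, and `truncate_week_monday` is a hypothetical helper introduced only for illustration; it relies on `date.weekday()` returning 0 for Monday.

```python
from datetime import date, timedelta


def truncate_week_monday(d: date) -> date:
    """Hypothetical helper: roll d back to the Monday of its week."""
    return d - timedelta(days=d.weekday())  # weekday(): Monday == 0, Sunday == 6


# Same date range as the reproducible example: Monday the 14th to Sunday the 20th.
days = [date(2022, 11, 14) + timedelta(days=i) for i in range(7)]
truncated = {truncate_week_monday(d) for d in days}
print(truncated)  # a single value: {datetime.date(2022, 11, 14)}
```

All seven dates collapse to 2022-11-14, which is the single unique value the truncated column is expected to contain.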

Installed versions

---Version info---
Polars: 0.14.29
Index type: UInt32
Platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:36:39) [GCC 10.4.0]
---Optional dependencies---
pyarrow: <not installed>
pandas: 1.5.1
numpy: 1.23.4
fsspec: 2022.11.0
connectorx: <not installed>
xlsx2csv: <not installed>
matplotlib: 3.6.2

mjkanji, Nov 20 '22 00:11