fix: Handle DST transitions in group_by_dynamic (#25410)
Fixes #25410.
group_by_dynamic panicked when window boundaries fell on non-existent datetimes during DST transitions, even when the actual data didn't contain those times.
Reproduction
import polars as pl
import datetime as dt
df = pl.DataFrame({
"timestamp": pl.datetime_range(
dt.datetime(2024, 2, 7, 9, 31),
dt.datetime(2024, 2, 24, 16, 0),
time_zone="America/New_York",
eager=True,
interval="1m"
)
}).join(pl.DataFrame({"id": [1, 2, 3]}), how="cross")
df.group_by_dynamic(
"timestamp",
every="1m",
period=dt.timedelta(days=17),
closed="both",
label="left",
group_by="id"
).agg(ts_last=pl.col("timestamp").last())
# Panics: datetime '2024-03-10 02:00:00' is non-existent in time zone 'America/New_York'
Cause
The BoundsIter in polars-time calls .unwrap() on duration addition when computing window boundaries. With a 17-day period starting from Feb 7, the window boundaries include March 10, 2024 2:00 AM—a non-existent time in America/New_York when DST springs forward to 3:00 AM. This triggers try_localize_datetime to raise an error with NonExistent::Raise.
Fix
Added fallback helper functions that catch non-existent datetime errors during window boundary calculations. When a boundary falls on a DST gap, the code skips forward by 1 hour and retries. This maintains performance in the common case (just a Result check) while gracefully handling DST transitions.
Codecov Report
:x: Patch coverage is 91.51671% with 33 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 79.68%. Comparing base (f6a9da8) to head (fd4bb14).
Additional details and impacted files
@@ Coverage Diff @@
## main #25459 +/- ##
==========================================
- Coverage 80.57% 79.68% -0.90%
==========================================
Files 1764 1764
Lines 242864 243204 +340
Branches 3044 3044
==========================================
- Hits 195697 193804 -1893
- Misses 46385 48618 +2233
Partials 782 782
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
I feel this is a bit dangerous. We don't check if the error is indeed a DST transition, and even if true I'm not sure if skipping a full hour is always the correct behavior.
@orlp Applied your feedback. The fix now uses a fallback helper that catches non-existent datetime errors during window boundary calculations and skips forward past the DST gap when a boundary falls on it.