polars icon indicating copy to clipboard operation
polars copied to clipboard

fix: Handle DST transitions in group_by_dynamic (#25410)

Open wtn opened this issue 1 month ago • 3 comments

Fixes #25410.

group_by_dynamic panicked when window boundaries fell on non-existent datetimes during DST transitions, even when the actual data didn't contain those times.

Reproduction

import polars as pl
import datetime as dt

df = pl.DataFrame({
    "timestamp": pl.datetime_range(
        dt.datetime(2024, 2, 7, 9, 31),
        dt.datetime(2024, 2, 24, 16, 0),
        time_zone="America/New_York",
        eager=True,
        interval="1m"
    )
}).join(pl.DataFrame({"id": [1, 2, 3]}), how="cross")

df.group_by_dynamic(
    "timestamp",
    every="1m",
    period=dt.timedelta(days=17),
    closed="both",
    label="left",
    group_by="id"
).agg(ts_last=pl.col("timestamp").last())
# Panics: datetime '2024-03-10 02:00:00' is non-existent in time zone 'America/New_York'

Cause

The BoundsIter in polars-time calls .unwrap() on duration addition when computing window boundaries. With a 17-day period starting from Feb 7, the window boundaries include March 10, 2024 2:00 AM—a non-existent time in America/New_York when DST springs forward to 3:00 AM. This triggers try_localize_datetime to raise an error with NonExistent::Raise.

Fix

Added fallback helper functions that catch non-existent datetime errors during window boundary calculations. When a boundary falls on a DST gap, the code skips forward by 1 hour and retries. This maintains performance in the common case (just a Result check) while gracefully handling DST transitions.

wtn avatar Nov 22 '25 04:11 wtn

Codecov Report

:x: Patch coverage is 91.51671% with 33 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 79.68%. Comparing base (f6a9da8) to head (fd4bb14).

Files with missing lines Patch % Lines
crates/polars-time/src/windows/window.rs 81.50% 27 Missing :warning:
crates/polars-time/src/windows/test.rs 98.22% 4 Missing :warning:
crates/polars-time/src/windows/group_by.rs 88.88% 2 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #25459      +/-   ##
==========================================
- Coverage   80.57%   79.68%   -0.90%     
==========================================
  Files        1764     1764              
  Lines      242864   243204     +340     
  Branches     3044     3044              
==========================================
- Hits       195697   193804    -1893     
- Misses      46385    48618    +2233     
  Partials      782      782              

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Nov 22 '25 04:11 codecov[bot]

I feel this is a bit dangerous. We don't check if the error is indeed a DST transition, and even if true I'm not sure if skipping a full hour is always the correct behavior.

orlp avatar Nov 25 '25 12:11 orlp

@orlp Applied your feedback. The fix now uses a fallback helper that catches non-existent datetime errors during window boundary calculations and skips forward past the DST gap when a boundary falls on it.

wtn avatar Dec 01 '25 20:12 wtn