hamilton
Hamilton tracker gives an error with datetime columns in polars dataframes
Reproduction
from datetime import datetime
import polars as pl
from hamilton import driver
from hamilton_sdk import adapters
import __main__ as dag
def df() -> pl.Series:
    return pl.Series(
        "timestamp",
        [
            datetime(2021, 1, 1),
            datetime(2021, 1, 2),
            datetime(2021, 1, 3),
        ],
    )

tracker = adapters.HamiltonTracker(
    project_id=1,
    username="elyase",
    dag_name="polars",
)
dr = driver.Builder().with_modules(dag).with_adapters(tracker).build()
result = dr.execute(["df"])
Stack Traces
  File "/Users/yaser/Documents/shitcoins/.venv/lib/python3.12/site-packages/hamilton/node.py", line 249, in __call__
    return self.callable(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elyase/test/.venv/lib/python3.12/site-packages/hamilton_sdk/tracking/polars_col_stats.py", line 52, in std
    return col.std()
           ^^^^^^^^^
  File "/Users/elyase/test/.venv/lib/python3.12/site-packages/polars/series/series.py", line 2049, in std
    return self._s.std(ddof)
           ^^^^^^^^^^^^^^^^^
polars.exceptions.InvalidOperationError: `std` operation not supported for dtype `datetime[μs]`
Thanks @elyase! We might not have kept up with all the polars changes.
Do you have an example dataframe this breaks on?
Hi @skrawcz, thanks for your reply. I edited the ticket with a reproduction.