polars
Can't cast +00:00 to Europe/Brussels
Polars version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of Polars.
Issue description
Casting a fixed-offset time zone such as +00:00 (or +01:00, as in the example below) to Europe/Brussels raises a ComputeError.
Reproducible example
In [13]: pl.Series(['2020-01-01']).str.strptime(pl.Datetime).dt.with_time_zone('+01:00').dt.cast_time_zone('Europe/Brussels')
---------------------------------------------------------------------------
ComputeError Traceback (most recent call last)
Cell In[13], line 1
----> 1 pl.Series(['2020-01-01']).str.strptime(pl.Datetime).dt.with_time_zone('+01:00').dt.cast_time_zone('Europe/Brussels')
File ~/tmp/.311venv/lib/python3.11/site-packages/polars/internals/series/utils.py:98, in call_expr.<locals>.wrapper(self, *args, **kwargs)
96 expr = getattr(expr, namespace)
97 f = getattr(expr, func.__name__)
---> 98 return s.to_frame().select(f(*args, **kwargs)).to_series()
File ~/tmp/.311venv/lib/python3.11/site-packages/polars/internals/dataframe/frame.py:5592, in DataFrame.select(self, exprs)
5504 def select(
5505 self: DF,
5506 exprs: (
(...)
5511 ),
5512 ) -> DF:
5513 """
5514 Select columns from this DataFrame.
5515
(...)
5589
5590 """
5591 return self._from_pydf(
-> 5592 self.lazy().select(exprs).collect(no_optimization=True)._df
5593 )
File ~/tmp/.311venv/lib/python3.11/site-packages/polars/utils.py:394, in deprecated_alias.<locals>.deco.<locals>.wrapper(*args, **kwargs)
391 @functools.wraps(fn)
392 def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
393 _rename_kwargs(fn.__name__, kwargs, aliases)
--> 394 return fn(*args, **kwargs)
File ~/tmp/.311venv/lib/python3.11/site-packages/polars/internals/lazyframe/frame.py:1168, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, no_optimization, slice_pushdown, common_subplan_elimination, streaming)
1157 common_subplan_elimination = False
1159 ldf = self._ldf.optimization_toggle(
1160 type_coercion,
1161 predicate_pushdown,
(...)
1166 streaming,
1167 )
-> 1168 return pli.wrap_df(ldf.collect())
ComputeError: Could not parse timezone: 'Europe/Brussels'
Expected behavior
shape: (1,)
Series: '' [datetime[μs, Europe/Brussels]]
[
2020-01-01 01:00:00 CET
]
Also, the error message should probably say
ComputeError: Could not parse timezone: '+01:00'
as that's the part which can't be parsed by chrono_tz?
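To illustrate the point about '+01:00' being the unparseable part, here is a minimal standard-library Python sketch of a parser that accepts both fixed offsets and IANA names. `parse_time_zone` is a hypothetical helper made up for this example; it is not how polars or chrono_tz handle time zones, and `zoneinfo` assumes a tz database is available on the system.

```python
import re
from datetime import timedelta, timezone, tzinfo
from zoneinfo import ZoneInfo

# Matches fixed offsets written as +HH:MM or -HH:MM.
_FIXED_OFFSET = re.compile(r"([+-])(\d{2}):(\d{2})")


def parse_time_zone(tz: str) -> tzinfo:
    """Hypothetical helper: try a fixed offset first, then an IANA name."""
    match = _FIXED_OFFSET.fullmatch(tz)
    if match:
        sign = 1 if match.group(1) == "+" else -1
        offset = timedelta(hours=int(match.group(2)), minutes=int(match.group(3)))
        return timezone(sign * offset)
    # Raises ZoneInfoNotFoundError if the name is not in the tz database.
    return ZoneInfo(tz)


fixed = parse_time_zone("+01:00")
named = parse_time_zone("Europe/Brussels")
```

With a fallback like this, an error raised for an unknown name could report the exact string that failed to parse.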
Installed versions
---Version info---
Polars: 0.15.16
Index type: UInt32
Platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python: 3.11.1 (main, Dec 7 2022, 01:11:34) [GCC 11.3.0]
---Optional dependencies---
pyarrow: 10.0.1
pandas: 1.5.3
numpy: 1.24.1
fsspec: <not installed>
connectorx: <not installed>
xlsx2csv: <not installed>
deltalake: <not installed>
matplotlib: <not installed>
related: https://github.com/pola-rs/polars/issues/6338
It seems there is some confusion about what cast_time_zone and with_time_zone should do.
Assuming
with_time_zone: set tz (localize)
cast_time_zone: convert tz,
Polars stores all values in memory as UTC. The UTC times are only adjusted for the time zone when the data is shown to users or exported.
with_time_zone leaves the values as is and simply sets a different time zone on the column.
cast_time_zone takes the current time zone and the given time zone and shifts the underlying values by the difference.
That's the idea at least. :)
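A rough standard-library analogue of the two behaviours described above, as a sketch only. The helper names `relabel_time_zone` and `shift_to_time_zone` are made up for this example and are not polars functions; the mapping to with_time_zone and cast_time_zone follows the description in this comment, and `zoneinfo` assumes a tz database is available on the system.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from zoneinfo import ZoneInfo


def relabel_time_zone(dt: datetime, tz: tzinfo) -> datetime:
    """Like with_time_zone as described: same instant, different zone for display."""
    return dt.astimezone(tz)


def shift_to_time_zone(dt: datetime, tz: tzinfo) -> datetime:
    """Like cast_time_zone as described: keep the wall-clock reading, so the
    underlying instant moves by the difference between the two offsets."""
    return dt.replace(tzinfo=tz)


plus_one_tz = timezone(timedelta(hours=1))
brussels_tz = ZoneInfo("Europe/Brussels")

# Winter: Brussels is at +01:00, so the shift changes nothing.
stored = datetime(2020, 1, 1, tzinfo=timezone.utc)
plus_one = relabel_time_zone(stored, plus_one_tz)
brussels = shift_to_time_zone(plus_one, brussels_tz)

# Summer: Brussels is at +02:00, so the underlying instant moves by one hour.
summer_stored = datetime(2020, 7, 1, tzinfo=timezone.utc)
summer_plus_one = relabel_time_zone(summer_stored, plus_one_tz)
summer_brussels = shift_to_time_zone(summer_plus_one, brussels_tz)
```

For the date in the reproducible example, the wall-clock reading stays at 01:00 and the zone becomes CET, which matches the expected output in the report.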
@ritchie46 sorry, didn't want to cause more confusion than there actually is (post edited).
Kinda on the way to having something working for this, getting there 💪