Deprecate some flexibility in with_time_zone
Problem description
The current state of timezone conversions in polars is:
- tz_localize: tz-naive => tz-aware
- with_time_zone: tz-naive | tz-aware => tz-aware | tz-naive (converts to/from timezone or UTC)
- cast_time_zone: tz-aware => tz-aware | tz-naive (sets or removes timezone)
The second one is quite flexible - perhaps too much so?
with_time_zone on a tz-naive Series converts as if it were starting from UTC:
In [8]: tz_naive
Out[8]:
shape: (1,)
Series: '' [datetime[μs]]
[
2020-01-01 03:00:00
]
In [9]: tz_naive.dt.with_time_zone('America/Barbados')
Out[9]:
shape: (1,)
Series: '' [datetime[μs, America/Barbados]]
[
2019-12-31 23:00:00 AST
]
I think this is rarely what people are actually looking for. Instead, they probably wanted tz_localize:
In [10]: tz_naive.dt.tz_localize('America/Barbados')
Out[10]:
shape: (1,)
Series: '' [datetime[μs, America/Barbados]]
[
2020-01-01 03:00:00 AST
]
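To make the difference between the two results concrete, here is a minimal sketch using only the Python standard library (datetime and zoneinfo), not polars itself. It assumes the system has the IANA time zone database available; the variable names are illustrative.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

naive = datetime(2020, 1, 1, 3, 0, 0)
barbados = ZoneInfo("America/Barbados")

# "Localize": keep the wall-clock time and attach the zone to it.
localized = naive.replace(tzinfo=barbados)

# "Convert as if from UTC": treat the naive value as a UTC instant,
# then express that same instant in the target zone.
converted = naive.replace(tzinfo=timezone.utc).astimezone(barbados)

print(localized.isoformat())  # 2020-01-01T03:00:00-04:00
print(converted.isoformat())  # 2019-12-31T23:00:00-04:00
```

The two results share a time zone but are four hours apart, which is why silently picking the second interpretation for tz-naive input can surprise people.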
Suggestion
I'd like to suggest making the second one stricter, so that one has:
- tz_localize: tz-naive => tz-aware (sets time zone)
- with_time_zone: tz-aware => tz-aware (converts time zone)
- cast_time_zone: tz-aware => tz-aware | tz-naive (changes or removes time zone)
This would be simpler and more predictable.
And in the (rare?) case someone really did want the current behaviour of with_time_zone on tz-naive input, they can just do:
In [12]: tz_naive.dt.tz_localize('UTC').dt.with_time_zone('America/Barbados')
Out[12]:
shape: (1,)
Series: '' [datetime[μs, America/Barbados]]
[
2019-12-31 23:00:00 AST
]
which is more explicit.
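The stricter contract and the explicit two-step conversion can be sketched on plain Python datetime objects. The helpers below are hypothetical stand-ins that borrow the method names from this issue; they are not the polars implementation, and raising ValueError is just one possible way to reject the wrong input.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def tz_localize(dt: datetime, tz: str) -> datetime:
    """tz-naive => tz-aware: attach a zone, keeping the wall-clock time."""
    if dt.tzinfo is not None:
        raise ValueError("tz_localize expects a tz-naive datetime")
    return dt.replace(tzinfo=ZoneInfo(tz))


def with_time_zone(dt: datetime, tz: str) -> datetime:
    """tz-aware => tz-aware: same instant, expressed in another zone."""
    if dt.tzinfo is None:
        raise ValueError(
            "with_time_zone expects a tz-aware datetime; call tz_localize first"
        )
    return dt.astimezone(ZoneInfo(tz))


naive = datetime(2020, 1, 1, 3, 0, 0)

# The starting zone is now spelled out instead of being implied.
explicit = with_time_zone(tz_localize(naive, "UTC"), "America/Barbados")
print(explicit.isoformat())  # 2019-12-31T23:00:00-04:00
```

With this shape, calling with_time_zone directly on the naive value fails loudly rather than quietly assuming UTC.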
So, concretely, the two behaviours I'm suggesting to deprecate are:
- calling with_time_zone on tz-naive;
- calling with_time_zone(None).
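During a deprecation period, both cases could keep working while emitting a warning. The sketch below shows that pattern on plain Python datetime objects; the function is a hypothetical stand-in rather than polars code, and the behaviour assumed for with_time_zone(None) (convert to UTC, then drop the zone) is only my reading of the list at the top of this issue.

```python
import warnings
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo


def with_time_zone(dt: datetime, tz: Optional[str]) -> datetime:
    """Hypothetical transition-period helper: old behaviour plus warnings."""
    if dt.tzinfo is None:
        warnings.warn(
            "with_time_zone on tz-naive input is deprecated; "
            "use tz_localize('UTC') first",
            DeprecationWarning,
            stacklevel=2,
        )
        # Old behaviour: treat the naive value as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    if tz is None:
        warnings.warn(
            "with_time_zone(None) is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumed old behaviour: convert to UTC, then drop the zone.
        return dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt.astimezone(ZoneInfo(tz))


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = with_time_zone(datetime(2020, 1, 1, 3, 0, 0), "America/Barbados")

print(result.isoformat())                 # 2019-12-31T23:00:00-04:00
print([str(w.message) for w in caught])   # one deprecation message
```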