spans
spans copied to clipboard
Unhelpful error for `&` operator on TZ-mismatched datetimerangesets
Version: Spans 1.0.2, 1.1.0
OS: Windows 10
Problem: When intersecting two datetimerangesets, one TZ-aware, the other TZ-naïve, we get a different message depending upon whether we use the .intersection() method or the set intersection symbol &. Using & gives the error TypeError: unsupported operand type(s) for &: 'datetimerangeset' and 'datetimerangeset'. Using .intersection() gives TypeError: Cannot compare tz-naive and tz-aware timestamps
Expected behavior: TypeError: Cannot compare tz-naive and tz-aware timestamps whether using .intersection() or &.
MWE:
>>> import pandas as pd
>>> from spans import datetimerange, datetimerangeset
>>> begin1 = pd.to_datetime('2016')
>>> begin2 = pd.to_datetime('2017').tz_localize('UTC')
>>> end1 = pd.to_datetime('2018')
>>> end2 = pd.to_datetime('2019').tz_localize('UTC')
>>> r1 = datetimerange(begin1, end1)
>>> r2 = datetimerange(begin2, end2)
>>> s1 = datetimerangeset([r1])
>>> s2 = datetimerangeset([r2])
>>> s1 & s2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'datetimerangeset' and 'datetimerangeset'
>>> s1.intersection(s2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\600301\Documents\GitHub\test\delete\lib\site-packages\spans\settypes.py", line 548, in intersection
intersection.add(a.intersection(b))
File "C:\Users\600301\Documents\GitHub\test\delete\lib\site-packages\spans\types.py", line 661, in intersection
if not self or not other or not self.overlap(other):
File "C:\Users\600301\Documents\GitHub\test\delete\lib\site-packages\spans\types.py", line 504, in overlap
return sl < ou and ol < su
File "C:\Users\600301\Documents\GitHub\test\delete\lib\site-packages\spans\types.py", line 57, in __lt__
if self.value < other.value:
File "pandas\_libs\tslibs\c_timestamp.pyx", line 122, in pandas._libs.tslibs.c_timestamp._Timestamp.__richcmp__
File "pandas\_libs\tslibs\c_timestamp.pyx", line 182, in pandas._libs.tslibs.c_timestamp._Timestamp._assert_tzawareness_compat
TypeError: Cannot compare tz-naive and tz-aware timestamps
>>> begin1tz = begin1.tz_localize('UTC')
>>> end1tz = end1.tz_localize('UTC')
>>> r1tz = datetimerange(begin1tz, end1tz)
>>> s1tz = datetimerangeset([r1tz])
>>> s1tz & s2
datetimerangeset([datetimerange(Timestamp('2017-01-01 00:00:00+0000', tz='UTC'), Timestamp('2018-01-01 00:00:00+0000', tz='UTC'))])
Thanks for reporting this! It is indeed not a helpful error message. It seems to be because I translate the TypeError to NotImplemented:
https://github.com/runfalk/spans/blob/420a7bd2cfbc5b92c21a008495fb8632cced0a03/spans/settypes.py#L568-L572
I think the best way of fixing this is actually to have one range for naive datetimes and one for time zone aware. Introducing this would unfortunately be backwards incompatible.
The other option I can think of is to special case the datetimerange and range set to bypass the exception catcher and just raise the proper error. I think I need to have a proper think about this.
Given how much chaos time zones have caused programmers, one would hope pandas would extend TypeError to make it easier to except 😛.
What about
- Add boolean attribute to
datetimerangeset, e.g.localized. - Override
__and__()fordatetimerangeset - Keep the exception catcher in place in
datetimerangeset.__and__()and check if bothselfandotherare correct types and are are bothlocalized/not localized. If not, you can stillreturn NotImplemented.
I get that it's a bit inelegant because of type checking, but it would work. I can try to put together a PR?