spans icon indicating copy to clipboard operation
spans copied to clipboard

Unhelpful error for `&` operator on TZ-mismatched datetimerangesets

Open Jacob-Stevens-Haas opened this issue 5 years ago • 2 comments
trafficstars

Version: Spans 1.0.2, 1.1.0 OS: Windows 10 Problem: When intersecting two datetimerangesets, one TZ-aware, the other TZ-naïve, we get a different message depending upon whether we use the .intersection() method or the set intersection symbol &. Using & gives the error TypeError: unsupported operand type(s) for &: 'datetimerangeset' and 'datetimerangeset'. Using .intersection() gives TypeError: Cannot compare tz-naive and tz-aware timestamps Expected behavior: TypeError: Cannot compare tz-naive and tz-aware timestamps whether using .intersection() or &.

MWE:

>>> import pandas as pd
>>> from spans import datetimerange, datetimerangeset
>>> begin1 = pd.to_datetime('2016')
>>> begin2 = pd.to_datetime('2017').tz_localize('UTC')
>>> end1 = pd.to_datetime('2018')
>>> end2 = pd.to_datetime('2019').tz_localize('UTC')
>>> r1 = datetimerange(begin1, end1)
>>> r2 = datetimerange(begin2, end2)
>>> s1 = datetimerangeset([r1])
>>> s2 = datetimerangeset([r2])
>>> s1 & s2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'datetimerangeset' and 'datetimerangeset'
>>> s1.intersection(s2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\600301\Documents\GitHub\test\delete\lib\site-packages\spans\settypes.py", line 548, in intersection
    intersection.add(a.intersection(b))
  File "C:\Users\600301\Documents\GitHub\test\delete\lib\site-packages\spans\types.py", line 661, in intersection
    if not self or not other or not self.overlap(other):
  File "C:\Users\600301\Documents\GitHub\test\delete\lib\site-packages\spans\types.py", line 504, in overlap
    return sl < ou and ol < su
  File "C:\Users\600301\Documents\GitHub\test\delete\lib\site-packages\spans\types.py", line 57, in __lt__
    if self.value < other.value:
  File "pandas\_libs\tslibs\c_timestamp.pyx", line 122, in pandas._libs.tslibs.c_timestamp._Timestamp.__richcmp__
  File "pandas\_libs\tslibs\c_timestamp.pyx", line 182, in pandas._libs.tslibs.c_timestamp._Timestamp._assert_tzawareness_compat
TypeError: Cannot compare tz-naive and tz-aware timestamps
>>> begin1tz  = begin1.tz_localize('UTC')
>>> end1tz = end1.tz_localize('UTC')
>>> r1tz = datetimerange(begin1tz, end1tz)
>>> s1tz = datetimerangeset([r1tz])
>>> s1tz & s2
datetimerangeset([datetimerange(Timestamp('2017-01-01 00:00:00+0000', tz='UTC'), Timestamp('2018-01-01 00:00:00+0000', tz='UTC'))])

Jacob-Stevens-Haas avatar Jun 26 '20 15:06 Jacob-Stevens-Haas

Thanks for reporting this! It is indeed not a helpful error message. It seems to be because I translate the TypeError to NotImplemented:

https://github.com/runfalk/spans/blob/420a7bd2cfbc5b92c21a008495fb8632cced0a03/spans/settypes.py#L568-L572

I think the best way of fixing this is actually to have one range for naive datetimes and one for time zone aware. Introducing this would unfortunately be backwards incompatible.

The other option I can think of is to special case the datetimerange and range set to bypass the exception catcher and just raise the proper error. I think I need to have a proper think about this.

runfalk avatar Jun 26 '20 21:06 runfalk

Given how much chaos time zones have caused programmers, one would hope pandas would extend TypeError to make it easier to except 😛.

What about

  1. Add boolean attribute to datetimerangeset, e.g. localized.
  2. Override __and__() for datetimerangeset
  3. Keep the exception catcher in place in datetimerangeset.__and__() and check if both self and other are correct types and are are both localized/not localized. If not, you can still return NotImplemented.

I get that it's a bit inelegant because of type checking, but it would work. I can try to put together a PR?

Jacob-Stevens-Haas avatar Jul 09 '20 19:07 Jacob-Stevens-Haas