filter_time_nearest : TypeError: can't subtract offset-naive and offset-aware datetimes

Open captcha1 opened this issue 4 years ago • 2 comments

import datetime
import pytz

tz0 = pytz.timezone('US/Pacific')
time0 = tz0.localize(datetime.datetime.now())

#OK#time0 = datetime.datetime.utcnow()

stn0 = 'KMUX'
nexrad2 = 'https://thredds-test.unidata.ucar.edu/thredds/catalog/nexrad/level2/{stn}/{date:%Y%m%d}/catalog.xml'

import siphon.catalog

tdscat1 = siphon.catalog.TDSCatalog(nexrad2.format(stn=stn0, date=time0))
tds1 = tdscat1.datasets.filter_time_nearest(time0)

print(tds1.access_urls['OPENDAP'])

Outputs :

Traceback (most recent call last):
  File "/tmp/filter_time_nearest-bug.py", line 15, in <module>
    tds1 = tdscat1.datasets.filter_time_nearest(time0)
  File "$p/lib/python3.10/site-packages/siphon/catalog.py", line 114, in filter_time_nearest
    return min(self._get_datasets_with_times(regex, strptime),
  File "$p/lib/python3.10/site-packages/siphon/catalog.py", line 115, in <lambda>
    key=lambda i: abs((i[0] - time).total_seconds()))[-1]
TypeError: can't subtract offset-naive and offset-aware datetimes
  • Problem description : Perhaps filter_time_nearest should accept "offset-aware datetimes" ...

  • Expected output : https://thredds-test.unidata.ucar.edu/thredds/dodsC/nexrad/level2/KMUX/20220212/Level2_KMUX_20220212_063207.ar2v

  • Which platform : Fedora 35 !

  • Versions. Include the output of:

    • python --version : Python 3.10.2
    • `python -c 'import siphon; print(siphon.__version__)'` : 0.9

captcha1 avatar Feb 12 '22 06:02 captcha1
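The error in the traceback can be reproduced without a THREDDS server. The sketch below assumes the dataset times are parsed from filenames with `strptime`, which yields timezone-naive datetimes; the format string and the fixed UTC-8 offset standing in for US/Pacific are illustrative assumptions, not taken from Siphon's source.

```python
from datetime import datetime, timedelta, timezone

# Naive timestamp, parsed the way a filename timestamp might be
# (the format string is an assumption for illustration).
parsed = datetime.strptime("20220212_063207", "%Y%m%d_%H%M%S")

# Aware timestamp, similar to what tz0.localize(...) returns in the report.
# A fixed UTC-8 offset stands in for US/Pacific in February.
pacific = timezone(timedelta(hours=-8))
requested = datetime(2022, 2, 11, 22, 30, tzinfo=pacific)

try:
    parsed - requested  # naive minus aware is not allowed
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Subtracting in either order fails the same way, so any key function that mixes the two kinds of datetime raises before a nearest match can be chosen.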

This will fix it :

--- catalog.py-orig     2020-10-28 00:25:15.000000000 -0700
+++ catalog.py  2022-02-11 23:45:07.145859536 -0800
@@ -9,6 +9,7 @@
 
 from collections import OrderedDict
 from datetime import datetime
+from datetime import timezone
 import logging
 import re
 import warnings
@@ -111,6 +112,8 @@
             The value with a time closest to that desired
 
         """
+        time = time.astimezone(timezone.utc)
+        time = time.replace(tzinfo=None)
         return min(self._get_datasets_with_times(regex, strptime),
                    key=lambda i: abs((i[0] - time).total_seconds()))[-1]

Dunno if this is the "best" fix or if it fixes similar problems.

Also, my code builds the catalog URL with nexrad2.format(stn=stn0, date=time0), so "time0" has to be in UTC there as well. I might as well call astimezone(timezone.utc) in my own code anyway ...

captcha1 avatar Feb 12 '22 08:02 captcha1

I appreciate you running down the fix. The problem is that this function really has to be kind of naive. We really have no idea what the timezone is for the times encoded in the dataset filenames. One would hope they're in UTC, but neither Siphon nor the TDS can enforce this (they're just the filenames on disk). So the simplest API is to have the user submit time as a timezone-less datetime that matches the native timezone for the datasets.

I'm open to suggestions on robust ways to improve this, but unfortunately the problem we're trying to solve is pretty unconstrained.

dopplershift avatar Feb 14 '22 18:02 dopplershift
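Given that reply, a caller-side workaround is to normalize the time before handing it to Siphon. This is a minimal sketch, not part of Siphon: `to_naive_utc` is a hypothetical helper, and it assumes the dataset filenames encode UTC, which the thread notes cannot be guaranteed.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(when: datetime) -> datetime:
    """Hypothetical helper: return `when` as a timezone-naive UTC datetime.

    Naive inputs are returned unchanged, on the assumption that the caller
    already expressed them in the datasets' native timezone.
    """
    if when.tzinfo is None or when.utcoffset() is None:
        return when
    # Convert to UTC first, then drop tzinfo so the result can be compared
    # with naive datetimes.
    return when.astimezone(timezone.utc).replace(tzinfo=None)


# Fixed UTC-8 offset standing in for US/Pacific in February (assumption).
pacific = timezone(timedelta(hours=-8))
local_time = datetime(2022, 2, 11, 22, 30, tzinfo=pacific)

query_time = to_naive_utc(local_time)
print(query_time)  # 2022-02-12 06:30:00
```

In the script from the report, the same `query_time` would then be used for both `nexrad2.format(stn=stn0, date=query_time)` and `tdscat1.datasets.filter_time_nearest(query_time)`, so the catalog date and the search time agree.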