Timezone Aware Datetime with Python>=3.12
What went wrong?
In Python 3.12 the datetime.utcnow() functionality is removed. The solution is to move to timezone aware datetime objects with
import datetime as datetime, UTC
date = datetime.now(UTC)
The issue that comes up is that our data that we bring in, whether surface, upper air, or gridded have traditionally not been timezone aware. There are many instances of setting a datetime, whether with utcnow() or with a manual setting (e.g., datetime(2017, 3, 8, 12) that are/were timezone naive. The issue comes in when comparing timezone aware to timezone naive values as that comparison cannot be done. This is most likely to come up when subsetting data for a certain time or time window.
This issue captures our need to address this with the challenge being in determining how to address this change holistically. Some changes may need to be in MetPy, while others lie upstream with other packages or might require code examples to highlight how to get the proper object type for users.
Operating System
MacOS
Version
1.5.1
Python Version
3.12
Code to Reproduce
from datetime import datetime, UTC
import xarray as xr
from metpy.units import units
date = datetime(2023, 11, 30, 12, tzinfo=UTC)
# Get GFS data for contouring
ds = xr.open_dataset('https://thredds.ucar.edu/thredds/dodsC/grib/NCEP/GFS/'
f'Global_onedeg_ana/GFS_Global_onedeg_ana_{date:%Y%m%d_%H%M}.grib2')
ds.Geopotential_height_isobaric.metpy.sel(time=date, vertical=850*units.hPa)
Errors, Traceback, and Logs
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
File ~/miniconda3/envs/main/lib/python3.11/site-packages/pandas/core/indexes/datetimes.py:579, in DatetimeIndex._disallow_mismatched_indexing(self, key)
577 try:
578 # GH#36148
--> 579 self._data._assert_tzawareness_compat(key)
580 except TypeError as err:
File ~/miniconda3/envs/main/lib/python3.11/site-packages/pandas/core/arrays/datetimes.py:770, in DatetimeArray._assert_tzawareness_compat(self, other)
769 if other_tz is not None:
--> 770 raise TypeError(
771 "Cannot compare tz-naive and tz-aware datetime-like objects."
772 )
773 elif other_tz is None:
TypeError: Cannot compare tz-naive and tz-aware datetime-like objects.
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
File ~/miniconda3/envs/main/lib/python3.11/site-packages/xarray/core/indexes.py:769, in PandasIndex.sel(self, labels, method, tolerance)
768 try:
--> 769 indexer = self.index.get_loc(label_value)
770 except KeyError as e:
File ~/miniconda3/envs/main/lib/python3.11/site-packages/pandas/core/indexes/datetimes.py:599, in DatetimeIndex.get_loc(self, key)
597 if isinstance(key, self._data._recognized_scalars):
598 # needed to localize naive datetimes
--> 599 self._disallow_mismatched_indexing(key)
600 key = Timestamp(key)
File ~/miniconda3/envs/main/lib/python3.11/site-packages/pandas/core/indexes/datetimes.py:581, in DatetimeIndex._disallow_mismatched_indexing(self, key)
580 except TypeError as err:
--> 581 raise KeyError(key) from err
KeyError: datetime.datetime(2023, 11, 30, 12, 0, tzinfo=datetime.timezone.utc)
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
Cell In[56], line 11
7 # Get GFS data for contouring
8 ds = xr.open_dataset('https://thredds.ucar.edu/thredds/dodsC/grib/NCEP/GFS/'
9 f'Global_onedeg_ana/GFS_Global_onedeg_ana_{date:%Y%m%d_%H%M}.grib2')
---> 11 ds.Geopotential_height_isobaric.metpy.sel(time=date, vertical=850*units.hPa)
File ~/miniconda3/envs/main/lib/python3.11/site-packages/metpy/xarray.py:644, in MetPyDataArrayAccessor.sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
642 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, 'sel')
643 indexers = _reassign_quantity_indexer(self._data_array, indexers)
--> 644 return self._data_array.sel(indexers, method=method, tolerance=tolerance, drop=drop)
File ~/miniconda3/envs/main/lib/python3.11/site-packages/xarray/core/dataarray.py:1582, in DataArray.sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
1472 def sel(
1473 self: T_DataArray,
1474 indexers: Mapping[Any, Any] | None = None,
(...)
1478 **indexers_kwargs: Any,
1479 ) -> T_DataArray:
1480 """Return a new DataArray whose data is given by selecting index
1481 labels along the specified dimension(s).
1482
(...)
1580 Dimensions without coordinates: points
1581 """
-> 1582 ds = self._to_temp_dataset().sel(
1583 indexers=indexers,
1584 drop=drop,
1585 method=method,
1586 tolerance=tolerance,
1587 **indexers_kwargs,
1588 )
1589 return self._from_temp_dataset(ds)
File ~/miniconda3/envs/main/lib/python3.11/site-packages/xarray/core/dataset.py:3020, in Dataset.sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
2959 """Returns a new dataset with each array indexed by tick labels
2960 along the specified dimension(s).
2961
(...)
3017 DataArray.sel
3018 """
3019 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel")
-> 3020 query_results = map_index_queries(
3021 self, indexers=indexers, method=method, tolerance=tolerance
3022 )
3024 if drop:
3025 no_scalar_variables = {}
File ~/miniconda3/envs/main/lib/python3.11/site-packages/xarray/core/indexing.py:190, in map_index_queries(obj, indexers, method, tolerance, **indexers_kwargs)
188 results.append(IndexSelResult(labels))
189 else:
--> 190 results.append(index.sel(labels, **options))
192 merged = merge_sel_results(results)
194 # drop dimension coordinates found in dimension indexers
195 # (also drop multi-index if any)
196 # (.sel() already ensures alignment)
File ~/miniconda3/envs/main/lib/python3.11/site-packages/xarray/core/indexes.py:771, in PandasIndex.sel(self, labels, method, tolerance)
769 indexer = self.index.get_loc(label_value)
770 except KeyError as e:
--> 771 raise KeyError(
772 f"not all values found in index {coord_name!r}. "
773 "Try setting the `method` keyword argument (example: method='nearest')."
774 ) from e
776 elif label_array.dtype.kind == "b":
777 indexer = label_array
KeyError: "not all values found in index 'time'. Try setting the `method` keyword argument (example: method='nearest')."
So the whole situation is kind of a mess. Pandas apparent does actually support timezones but that gets stored under its own Dtype; numpy datetime64 is defined to be naive; and xarray is caught in the middle as well. There is probably a role we could potentially play trying to contribute, but I personally do not have a great handle on all the potential pitfalls that are at play--I only know enough to know that dates & times & timezones are a huge pain.
I'm not even sure how you could open a netCDF file and make the time axis set the timezone to UTC, but it seems to be possible?
Given that, for MetPy I'm coming around to the idea that while we should address the warnings and avoid deprecated methods (#3255), we should keep using tz-naive datetime instances internally. So e.g.:
datetime.utcnow()
becomes
datetime.now(timezone.utc).replace(tzinfo=None)
That feels ugly to drop timezone information (which really is nice to have for robust labelling, etc.), but would preserve the existing behavior and avoid a bunch of headaches for our users until the rest of the ecosystem is more ready to deal with them.
So for completeness, #3255 implements this using a sensible mix:
- Internally used
datetime(e.g. directly passed to string formatting) stay tz-aware when generated - Stored and user-visible
datetimeinstances have the timezone information dropped to avoid changing behavior and creating the problems outlined above.
Is there anything more we need to do in MetPy itself about this?