
Endian-ness issues when selecting on a coordinate

[Open] banesullivan opened this issue 2 years ago • 3 comments

Error:

ValueError: Big-endian buffer not supported on little-endian compiler

Code to reproduce:

from siphon.catalog import TDSCatalog
import xarray as xr

catUrl = "https://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/Global_0p25deg_ana/catalog.xml"
datasetName = "Full Collection Dataset"

catalog = TDSCatalog(catUrl)
ds = catalog.datasets[datasetName].remote_access(use_xarray=True)

da = ds['Temperature_isobaric']

# bottom, left, top, right
box = (-96.7030229414862, 90-64.98056667458044, -7.397202215623508, 90+23.180353571106195)

min_lat, min_lon, max_lat, max_lon = box

roi = da.loc[{'lat': slice(min_lat, max_lat), 'lon': slice(min_lon, max_lon)}]
roi
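For reference, the same region can also be selected positionally, which sidesteps the pandas label lookup that fails below. This is a hedged sketch, not siphon or xarray API: the slice_positions helper is hypothetical, and it assumes the coordinate values themselves can be read and are ascending.

```python
import numpy as np

# Hypothetical helper (not part of siphon or xarray): compute positional
# bounds on a native-endian copy of a coordinate array, so the label-based
# lookup that raises the ValueError is never exercised.
def slice_positions(coord, lo, hi):
    vals = np.asarray(coord, dtype='float64')  # copy into native byte order
    i0, i1 = np.searchsorted(vals, [lo, hi])   # assumes ascending values
    return slice(int(i0), int(i1))
```

With it, the selection could be written as roi = da.isel(lat=slice_positions(da['lat'].values, min_lat, max_lat), lon=slice_positions(da['lon'].values, min_lon, max_lon)), noting that GFS latitudes are often stored descending, so the latitude array may need flipping first.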

Full stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[1], line 17
     13 box = (-96.7030229414862, 90-64.98056667458044, -7.397202215623508, 90+23.180353571106195)
     15 min_lat, min_lon, max_lat, max_lon = box
---> 17 roi = da.loc[{'lat': slice(min_lat, max_lat), 'lon': slice(min_lon, max_lon)}]
     18 roi

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/xarray/core/dataarray.py:213, in _LocIndexer.__getitem__(self, key)
    211     labels = indexing.expanded_indexer(key, self.data_array.ndim)
    212     key = dict(zip(self.data_array.dims, labels))
--> 213 return self.data_array.sel(key)

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/xarray/core/dataarray.py:1549, in DataArray.sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   1439 def sel(
   1440     self: T_DataArray,
   1441     indexers: Mapping[Any, Any] | None = None,
   (...)
   1445     **indexers_kwargs: Any,
   1446 ) -> T_DataArray:
   1447     """Return a new DataArray whose data is given by selecting index
   1448     labels along the specified dimension(s).
   1449 
   (...)
   1547     Dimensions without coordinates: points
   1548     """
-> 1549     ds = self._to_temp_dataset().sel(
   1550         indexers=indexers,
   1551         drop=drop,
   1552         method=method,
   1553         tolerance=tolerance,
   1554         **indexers_kwargs,
   1555     )
   1556     return self._from_temp_dataset(ds)

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/xarray/core/dataset.py:2642, in Dataset.sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   2581 """Returns a new dataset with each array indexed by tick labels
   2582 along the specified dimension(s).
   2583 
   (...)
   2639 DataArray.sel
   2640 """
   2641 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel")
-> 2642 query_results = map_index_queries(
   2643     self, indexers=indexers, method=method, tolerance=tolerance
   2644 )
   2646 if drop:
   2647     no_scalar_variables = {}

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/xarray/core/indexing.py:190, in map_index_queries(obj, indexers, method, tolerance, **indexers_kwargs)
    188         results.append(IndexSelResult(labels))
    189     else:
--> 190         results.append(index.sel(labels, **options))
    192 merged = merge_sel_results(results)
    194 # drop dimension coordinates found in dimension indexers
    195 # (also drop multi-index if any)
    196 # (.sel() already ensures alignment)

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/xarray/core/indexes.py:455, in PandasIndex.sel(self, labels, method, tolerance)
    452 coord_name, label = next(iter(labels.items()))
    454 if isinstance(label, slice):
--> 455     indexer = _query_slice(self.index, label, coord_name, method, tolerance)
    456 elif is_dict_like(label):
    457     raise ValueError(
    458         "cannot use a dict-like object for selection on "
    459         "a dimension that does not have a MultiIndex"
    460     )

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/xarray/core/indexes.py:218, in _query_slice(index, label, coord_name, method, tolerance)
    214 if method is not None or tolerance is not None:
    215     raise NotImplementedError(
    216         "cannot use ``method`` argument if any indexers are slice objects"
    217     )
--> 218 indexer = index.slice_indexer(
    219     _sanitize_slice_element(label.start),
    220     _sanitize_slice_element(label.stop),
    221     _sanitize_slice_element(label.step),
    222 )
    223 if not isinstance(indexer, slice):
    224     # unlike pandas, in xarray we never want to silently convert a
    225     # slice indexer into an array indexer
    226     raise KeyError(
    227         "cannot represent labeled-based slice indexer for coordinate "
    228         f"{coord_name!r} with a slice over integer positions; the index is "
    229         "unsorted or non-unique"
    230     )

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/pandas/core/indexes/base.py:6341, in Index.slice_indexer(self, start, end, step)
   6297 def slice_indexer(
   6298     self,
   6299     start: Hashable | None = None,
   6300     end: Hashable | None = None,
   6301     step: int | None = None,
   6302 ) -> slice:
   6303     """
   6304     Compute the slice indexer for input labels and step.
   6305 
   (...)
   6339     slice(1, 3, None)
   6340     """
-> 6341     start_slice, end_slice = self.slice_locs(start, end, step=step)
   6343     # return a slice
   6344     if not is_scalar(start_slice):

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/pandas/core/indexes/base.py:6534, in Index.slice_locs(self, start, end, step)
   6532 start_slice = None
   6533 if start is not None:
-> 6534     start_slice = self.get_slice_bound(start, "left")
   6535 if start_slice is None:
   6536     start_slice = 0

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/pandas/core/indexes/base.py:6453, in Index.get_slice_bound(self, label, side)
   6451 # we need to look up the label
   6452 try:
-> 6453     slc = self.get_loc(label)
   6454 except KeyError as err:
   6455     try:

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/pandas/core/indexes/base.py:3652, in Index.get_loc(self, key)
   3650 casted_key = self._maybe_cast_indexer(key)
   3651 try:
-> 3652     return self._engine.get_loc(casted_key)
   3653 except KeyError as err:
   3654     raise KeyError(key) from err

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/pandas/_libs/index.pyx:147, in pandas._libs.index.IndexEngine.get_loc()

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/pandas/_libs/index.pyx:169, in pandas._libs.index.IndexEngine.get_loc()

File ~/.pyenv/versions/pythia/lib/python3.11/site-packages/pandas/_libs/index.pyx:303, in pandas._libs.index.IndexEngine._ensure_mapping_populated()

File pandas/_libs/hashtable_class_helper.pxi:7104, in pandas._libs.hashtable.PyObjectHashTable.map_locations()

ValueError: Big-endian buffer not supported on little-endian compiler

System Info

  • Problem description: I'm trying to select a Lat/Lon region from a dataset
  • Expected output: a selected DataArray
  • Which platform (Linux, Windows, Mac, etc.): Linux
  • Versions. Include the output of:
    • python --version: Python 3.11.0
    • python -c 'import siphon; print(siphon.__version__)': 0.9

Other system info:

--------------------------------------------------------------------------------
  Date: Thu Jun 22 22:16:47 2023 EDT

                OS : Linux
            CPU(s) : 20
           Machine : x86_64
      Architecture : 64bit
               RAM : 93.0 GiB
       Environment : Python
       File system : ext4

  Python 3.11.0 (main, Jan  9 2023, 10:26:36) [GCC 9.4.0]

            xarray : 2023.5.0
             numpy : 1.25.0
            pandas : 2.0.2
            siphon : 0.9
             scipy : 1.10.1
           IPython : 8.14.0
        matplotlib : 3.7.1
            scooby : 0.7.2
--------------------------------------------------------------------------------

banesullivan · Jun 23 '23

This could well be independent of Siphon, but since I'm using THREDDS data, I thought I'd report it here per @ThomasMGeo's suggestion.

banesullivan · Jun 23 '23

This also occurs with da.sel({'isobaric': 6.50e+04}), which rules out the lon/lat min/max bounds as the cause.

banesullivan · Jun 23 '23
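That a scalar .sel fails too points at the pandas label lookup (the bottom of the stack trace is pandas hashtable code) rather than anything slice-specific. A hedged minimal reproduction, independent of siphon and xarray, assuming a little-endian machine and an affected pandas 2 release (on unaffected versions the lookup simply succeeds):

```python
import numpy as np
import pandas as pd

# Build an index backed by a big-endian array, as the THREDDS data appears
# to be, then force the hashtable lookup that the traceback bottoms out in.
idx = pd.Index(np.arange(5, dtype='>f8'))
try:
    loc = idx.get_loc(2.0)
    print("get_loc succeeded:", loc)
except ValueError as err:
    print("get_loc failed:", err)
```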

Looks like this is caused by pandas 2. As a workaround, install pandas<2.

dopplershift · Jul 05 '23
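Besides pinning pandas, an in-session sidestep may be to convert any big-endian coordinate arrays to native byte order before selecting. This is a sketch under the assumption that the error stems from big-endian coordinate data; the to_native helper is hypothetical, not part of any of these libraries.

```python
import numpy as np

def to_native(arr):
    """Return arr converted to the machine's native byte order if needed."""
    if not arr.dtype.isnative:
        # '=' requests the native byte order for the same dtype kind/size
        return arr.astype(arr.dtype.newbyteorder('='))
    return arr
```

Applied to the example above (hypothetical usage), something like ds = ds.assign_coords({name: to_native(ds[name].values) for name in ('lat', 'lon')}) before the .loc call would rebuild the indexes from native-endian arrays.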