
[HDFVirtualBackend] - `OSError: Generic HTTP error: Request error: Error performing GET <> in 106.490666ms - HTTP error: error sending request`

Open norlandrhagen opened this issue 7 months ago • 5 comments

Ran into this error when trying to open a NetCDF file over HTTPS on develop (270a484).

OSError: Generic HTTP error: Request error: Error performing GET https://data-dataref.ifremer.fr/ww3/GLOBMULTI_ERA5_GLOBCUR_01/GLOB-30M/2024/FIELD_NC/LOPS_WW3-GLOB-30M_202401.nc in 106.490666ms - HTTP error: error sending request

from virtualizarr import open_virtual_dataset
from virtualizarr.backends import HDFVirtualBackend


url = 'https://data-dataref.ifremer.fr/ww3/GLOBMULTI_ERA5_GLOBCUR_01/GLOB-30M/2024/FIELD_NC/LOPS_WW3-GLOB-30M_202401.nc'


vds = open_virtual_dataset(url, backend=HDFVirtualBackend)


----> 1 vds = open_virtual_dataset(url, backend=HDFVirtualBackend)

File ~/Documents/carbonplan/leap/feedstocks/wavewatch3/.venv/lib/python3.11/site-packages/virtualizarr/backend.py:210, in open_virtual_dataset(filepath, filetype, group, drop_variables, loadable_variables, decode_times, cftime_variables, indexes, virtual_backend_kwargs, reader_options, backend)
    207 if backend_cls is None:
    208     raise NotImplementedError(f"Unsupported file type: {filetype.name}")
--> 210 vds = backend_cls.open_virtual_dataset(
    211     filepath,
    212     group=group,
    213     drop_variables=drop_variables,
    214     loadable_variables=loadable_variables,
    215     decode_times=decode_times,
    216     indexes=indexes,
    217     virtual_backend_kwargs=virtual_backend_kwargs,
    218     reader_options=reader_options,
    219 )
    221 return vds

File ~/Documents/carbonplan/leap/feedstocks/wavewatch3/.venv/lib/python3.11/site-packages/virtualizarr/readers/hdf/hdf.py:217, in HDFVirtualBackend.open_virtual_dataset(filepath, group, drop_variables, loadable_variables, decode_times, indexes, virtual_backend_kwargs, reader_options)
    209 filepath = validate_and_normalize_path_to_uri(
    210     filepath, fs_root=Path.cwd().as_uri()
    211 )
    213 _drop_vars: Iterable[str] = (
    214     [] if drop_variables is None else list(drop_variables)
    215 )
--> 217 manifest_store = HDFVirtualBackend._create_manifest_store(
    218     filepath=filepath,
    219     drop_variables=_drop_vars,
    220     group=group,
    221 )
    222 ds = manifest_store.to_virtual_dataset(
    223     loadable_variables=loadable_variables,
    224     decode_times=decode_times,
    225     indexes=indexes,
    226 )
    227 return ds

File ~/Documents/carbonplan/leap/feedstocks/wavewatch3/.venv/lib/python3.11/site-packages/virtualizarr/readers/hdf/hdf.py:181, in HDFVirtualBackend._create_manifest_store(filepath, store, group, drop_variables)
    179 if not store:
    180     store = default_object_store(filepath)  # type: ignore
--> 181 manifest_group = HDFVirtualBackend._construct_manifest_group(
    182     store=store,
    183     filepath=filepath,
    184     group=group,
    185     drop_variables=drop_variables,
    186 )
    187 registry = ObjectStoreRegistry({filepath: store})
    188 # Convert to a manifest store

File ~/Documents/carbonplan/leap/feedstocks/wavewatch3/.venv/lib/python3.11/site-packages/virtualizarr/readers/hdf/hdf.py:154, in HDFVirtualBackend._construct_manifest_group(store, filepath, group, drop_variables)
    147 manifest_dict = {}
    148 # Several of our test fixtures which use xr.tutorial data have
    149 # non coord dimensions serialized using big endian dtypes which are not
    150 # yet supported in zarr-python v3.  We'll drop these variables for the
    151 # moment until big endian support is included upstream.)
    153 non_coordinate_dimension_vars = (
--> 154     HDFVirtualBackend._find_non_coord_dimension_vars(group=g)
    155 )
    156 drop_variables = list(set(list(drop_variables) + non_coordinate_dimension_vars))
    157 attrs = HDFVirtualBackend._extract_attrs(g)

File ~/Documents/carbonplan/leap/feedstocks/wavewatch3/.venv/lib/python3.11/site-packages/virtualizarr/readers/hdf/hdf.py:398, in HDFVirtualBackend._find_non_coord_dimension_vars(group)
    396 non_coordinate_dimension_variables = []
    397 for name, obj in group.items():
--> 398     if "_Netcdf4Dimid" in obj.attrs:
    399         dimension_names.append(name)
    400 for name, obj in group.items():

File h5py/_objects.pyx:54, in h5py._objects.with_phil.wrapper()

File h5py/_objects.pyx:55, in h5py._objects.with_phil.wrapper()

File ~/Documents/carbonplan/leap/feedstocks/wavewatch3/.venv/lib/python3.11/site-packages/h5py/_hl/attrs.py:272, in AttributeManager.__contains__(self, name)
    269 @with_phil
    270 def __contains__(self, name):
    271     """ Determine if an attribute exists, by name. """
--> 272     return h5a.exists(self._id, self._e(name))

File h5py/_objects.pyx:54, in h5py._objects.with_phil.wrapper()

File h5py/_objects.pyx:55, in h5py._objects.with_phil.wrapper()

File h5py/h5a.pyx:103, in h5py.h5a.exists()

File h5py/h5fd.pyx:164, in h5py.h5fd.H5FD_fileobj_read()

File ~/Documents/carbonplan/leap/feedstocks/wavewatch3/.venv/lib/python3.11/site-packages/virtualizarr/utils.py:38, in ObstoreReader.read(self, size)
     37 def read(self, size: int, /) -> bytes:
---> 38     return self._reader.read(size).to_bytes()

OSError: Generic HTTP error: Request error: Error performing GET https://data-dataref.ifremer.fr/ww3/GLOBMULTI_ERA5_GLOBCUR_01/GLOB-30M/2024/FIELD_NC/LOPS_WW3-GLOB-30M_202401.nc in 106.490666ms - HTTP error: error sending request

Opening with fsspec + h5netcdf:


import fsspec
import xarray as xr

url = 'https://data-dataref.ifremer.fr/ww3/GLOBMULTI_ERA5_GLOBCUR_01/GLOB-30M/2024/FIELD_NC/LOPS_WW3-GLOB-30M_202401.nc'
fs = fsspec.filesystem('http')
ds = xr.open_dataset(fs.open(url), engine='h5netcdf')

<xarray.Dataset> Size: 20GB
Dimensions:    (longitude: 720, latitude: 323, time: 248)
Coordinates:
  * longitude  (longitude) float32 3kB -180.0 -179.5 -179.0 ... 179.0 179.5
  * latitude   (latitude) float32 1kB -78.0 -77.5 -77.0 -76.5 ... 82.0 82.5 83.0
  * time       (time) datetime64[ns] 2kB 2024-01-01 ... 2024-01-31T21:00:00
Data variables: (12/86)
    MAPSTA     (latitude, longitude) int16 465kB ...
    dpt        (time, latitude, longitude) float32 231MB ...
    ucur       (time, latitude, longitude) float32 231MB ...
    vcur       (time, latitude, longitude) float32 231MB ...
    uwnd       (time, latitude, longitude) float32 231MB ...
    vwnd       (time, latitude, longitude) float32 231MB ...
    ...         ...
    vabr       (time, latitude, longitude) float32 231MB ...
    uubr       (time, latitude, longitude) float32 231MB ...
    vubr       (time, latitude, longitude) float32 231MB ...
    mssu       (time, latitude, longitude) float32 231MB ...
    mssc       (time, latitude, longitude) float32 231MB ...
    mssd       (time, latitude, longitude) float32 231MB ...


norlandrhagen avatar Apr 28 '25 16:04 norlandrhagen

My understanding is that with the obstore changes you don't need to use fsspec explicitly here; you just pass the path string.

TomNicholas avatar Apr 28 '25 17:04 TomNicholas

Totally. The fsspec bit was just to check if I could open up the file remotely with Xarray.

norlandrhagen avatar Apr 28 '25 18:04 norlandrhagen

Oh sorry, I must have misread your original report. Yeah, I don't know.

@maxrjones ? @kylebarron 😅

TomNicholas avatar Apr 28 '25 19:04 TomNicholas

That server is known to be flaky sometimes (it's the one my local HPC / computing department manages), so it may have been a temporary issue. I can't reproduce the failure right now, but opening the file does take an unreasonable amount of time (I didn't time it, though).
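One way to tell a transient server hiccup from a persistent failure is to retry the call a few times and record how long it takes. The following is a minimal sketch using only the standard library; `retry_with_backoff` and its parameters are hypothetical names introduced here for illustration, not part of VirtualiZarr or obstore.

```python
import time


def retry_with_backoff(fn, attempts=3, base_delay=1.0,
                       exceptions=(OSError,), sleep=time.sleep):
    """Call fn() until it succeeds or the attempts run out.

    Returns (result, elapsed_seconds, failures), where failures is the
    list of exceptions caught before the successful call.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    start = time.perf_counter()
    failures = []
    for attempt in range(attempts):
        try:
            result = fn()
        except exceptions as exc:
            failures.append(exc)
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
        else:
            return result, time.perf_counter() - start, failures


# Hypothetical usage with the call from this issue (needs network access):
# vds, elapsed, failures = retry_with_backoff(
#     lambda: open_virtual_dataset(url, backend=HDFVirtualBackend)
# )
```

If the call succeeds on a later attempt, the failure was probably transient; `elapsed` gives the timing that is missing above.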

keewis avatar May 05 '25 22:05 keewis

> Ran into this error when trying to open a NetCDF over https on develop 270a484.
>
> OSError: Generic HTTP error: Request error: Error performing GET https://data-dataref.ifremer.fr/ww3/GLOBMULTI_ERA5_GLOBCUR_01/GLOB-30M/2024/FIELD_NC/LOPS_WW3-GLOB-30M_202401.nc in 106.490666ms - HTTP error: error sending request

With the latest obstore, I think there should be more debugging output after that generic `HTTP error: error sending request`?
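Independently of the obstore version, it can help to print every exception linked to the `OSError` when reporting a failure like this. The sketch below walks Python's standard `__cause__` / `__context__` attributes. `exception_chain` is a hypothetical helper, and whether obstore attaches an underlying cause to the exception (rather than folding it into the message text) is an assumption that would need checking.

```python
def exception_chain(exc):
    """Return one 'Type: message' line per linked exception, outermost first."""
    lines = []
    seen = set()  # guard against reference cycles in the chain
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        lines.append(f"{type(exc).__name__}: {exc}")
        # Prefer the explicit cause ("raise ... from ..."), else the implicit context.
        exc = exc.__cause__ if exc.__cause__ is not None else exc.__context__
    return lines


# Hypothetical usage with the call from this issue (needs network access):
# try:
#     vds = open_virtual_dataset(url, backend=HDFVirtualBackend)
# except OSError as err:
#     print("\n".join(exception_chain(err)))
```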

kylebarron avatar May 06 '25 00:05 kylebarron