RuntimeError: NetCDF: HDF error OR Segmentation fault
Describe the bug Hi,
I just got started with satpy and want to try it out with the FCI Level 1c Normal Resolution Image Data - MTG - 0 degree EUMETSAT dataset.
I found an example in the documentation for EUMETSAT MTG (Third Generation) data, the fci_l1c_nc reader. However, when I try to use the boilerplate code from the example, I get one of two error messages, alternating randomly from run to run.
Error messages:
- RuntimeError: NetCDF: HDF error
- Segmentation fault
The program runs as expected up until vis_04_values = scn['vis_04'].values
To Reproduce
import hdf5plugin
from satpy.scene import Scene
from satpy import find_files_and_readers
from satpy.utils import debug_on
path_to_data = '/Users/mjodas/Downloads/W_XX-EUMETSAT-Darmstadt,IMG+SAT,MTI1+FCI-1C-RRAD-FDHSI-FD--x-x---x_C_EUMT_20250120120254_IDPFI_OPE_20250120120007_20250120120924_N__O_0073_0000'
# find files and assign the FCI reader
files = find_files_and_readers(base_dir=path_to_data, reader='fci_l1c_nc')
# create an FCI scene from the selected files
scn = Scene(filenames=files)
# print available dataset names for this scene (e.g. 'vis_04', 'vis_05','ir_38',...)
print(scn.available_dataset_names())
# print available composite names for this scene (e.g. 'natural_color', 'airmass', 'convection',...)
print(scn.available_composite_names())
# load the datasets/composites of interest
scn.load(['natural_color','vis_04'], upper_right_corner='NE')
# note: the data inside the FCI files is stored upside down. The upper_right_corner='NE' argument
# flips it automatically in upright position.
# you can access the values of a dataset as a Numpy array with
vis_04_values = scn['vis_04'].values
# resample the scene to a specified area (e.g. "eurol1" for Europe in 1km resolution)
scn_resampled = scn.resample("eurol", resampler='nearest', radius_of_influence=5000)
# save the resampled dataset/composite to disk
scn_resampled.save_dataset("natural_color", filename='./fci_natural_color_resampled.png')
Expected behavior I expect the script to run successfully and produce a PNG with a natural color composite of Europe.
Actual results
Traceback (most recent call last):
File "/Users/mjodas/workspace/eea/scripts/meteusat_satpy_example.py", line 29, in <module>
vis_04_values = scn['vis_04'].values
^^^^^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/xarray/core/dataarray.py", line 814, in values
return self.variable.values
^^^^^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/xarray/core/variable.py", line 565, in values
return _as_array_or_item(self._data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/xarray/core/variable.py", line 362, in _as_array_or_item
data = np.asarray(data)
^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/dask/array/core.py", line 1709, in __array__
x = self.compute()
^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/dask/base.py", line 372, in compute
(result,) = compute(self, traverse=False, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/dask/base.py", line 660, in compute
results = schedule(dsk, keys, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "src/netCDF4/_netCDF4.pyx", line 5079, in netCDF4._netCDF4.Variable.__getitem__
File "src/netCDF4/_netCDF4.pyx", line 6051, in netCDF4._netCDF4.Variable._get
File "src/netCDF4/_netCDF4.pyx", line 2164, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: HDF error
sys:1: DeprecationWarning: Call to deprecated function (or staticmethod) _destroy.
OR, I get
segmentation fault /Users/mjodas/workspace/eea/.venv/bin/python
Environment Info:
- OS: macOS Sequoia 15.2
- I have a MacBook M2 with 16 GB RAM.
- Satpy Version: 0.54.0
- PyResample Version:
- Readers and writers dependencies (when relevant): run from satpy.utils import check_satpy; check_satpy()
Output from check_satpy():
Readers
abi_l1b: ok
abi_l1b_scmi: ok
abi_l2_nc: ok
acspo: ok
agri_fy4a_l1: ok
agri_fy4b_l1: ok
ahi_hrit: ok
ahi_hsd: ok
ahi_l1b_gridded_bin: ok
ahi_l2_nc: ok
ami_l1b: cannot find module 'satpy.readers.ami_l1b' (No module named 'pyspectral')
amsr2_l1b: ok
amsr2_l2: ok
amsr2_l2_gaasp: ok
amsub_l1c_aapp: ok
ascat_l2_soilmoisture_bufr: cannot find module 'satpy.readers.ascat_l2_soilmoisture_bufr' (('Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes.\n Error: ', ModuleNotFoundError("No module named 'eccodes'")))
atms_l1b_nc: ok
atms_sdr_hdf5: ok
avhrr_l1b_aapp: ok
avhrr_l1b_eps: ok
avhrr_l1b_gaclac: cannot find module 'satpy.readers.avhrr_l1b_gaclac' (No module named 'pygac')
avhrr_l1b_hrpt: cannot find module 'satpy.readers.hrpt' (No module named 'geotiepoints')
avhrr_l1c_eum_gac_fdr_nc: ok
aws1_mwr_l1b_nc: ok
aws1_mwr_l1c_nc: ok
caliop_l2_cloud: cannot find module 'satpy.readers.caliop_l2_cloud' (cannot import name 'Dataset' from 'satpy.dataset' (/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/satpy/dataset/__init__.py))
clavrx: ok
cmsaf-claas2_l2_nc: ok
electrol_hrit: ok
epic_l1b_h5: ok
eps_sterna_mwr_l1b_nc: ok
fci_l1c_nc: ok
fci_l2_bufr: cannot find module 'satpy.readers.eum_l2_bufr' (Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes)
fci_l2_grib: cannot find module 'satpy.readers.eum_l2_grib' (Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes)
fci_l2_nc: ok
fy3a_mersi1_l1b: cannot find module 'satpy.readers.mersi_l1b' (No module named 'pyspectral')
fy3b_mersi1_l1b: cannot find module 'satpy.readers.mersi_l1b' (No module named 'pyspectral')
fy3c_mersi1_l1b: cannot find module 'satpy.readers.mersi_l1b' (No module named 'pyspectral')
generic_image: cannot find module 'satpy.readers.generic_image' (No module named 'rasterio')
geocat: ok
gerb_l2_hr_h5: ok
ghi_l1: ok
ghrsst_l2: ok
gld360_ualf2: ok
glm_l2: ok
gms5-vissr_l1b: cannot find module 'satpy.readers.gms.gms5_vissr_l1b' (No module named 'numba')
goci2_l2_nc: ok
goes-imager_hrit: ok
goes-imager_nc: ok
gpm_imerg: ok
grib: cannot find module 'satpy.readers.grib' (No module named 'pygrib')
hsaf_grib: cannot find module 'satpy.readers.hsaf_grib' (No module named 'pygrib')
hsaf_h5: ok
hy2_scat_l2b_h5: ok
iasi_l2: ok
iasi_l2_cdr_nc: ok
iasi_l2_so2_bufr: cannot find module 'satpy.readers.iasi_l2_so2_bufr' (('Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes.\n Error: ', ModuleNotFoundError("No module named 'eccodes'")))
ici_l1b_nc: cannot find module 'satpy.readers.ici_l1b_nc' (No module named 'geotiepoints')
insat3d_img_l1b_h5: ok
jami_hrit: ok
li_l2_nc: ok
maia: ok
mcd12q1: ok
meris_nc_sen3: ok
mersi2_l1b: cannot find module 'satpy.readers.mersi_l1b' (No module named 'pyspectral')
mersi3_l1b: cannot find module 'satpy.readers.mersi_l1b' (No module named 'pyspectral')
mersi_ll_l1b: cannot find module 'satpy.readers.mersi_l1b' (No module named 'pyspectral')
mersi_rm_l1b: cannot find module 'satpy.readers.mersi_l1b' (No module named 'pyspectral')
mhs_l1c_aapp: ok
mimicTPW2_comp: ok
mirs: ok
modis_l1b: ok
modis_l2: ok
modis_l3: ok
msi_safe: ok
msi_safe_l2a: ok
msu_gsa_l1b: ok
mtsat2-imager_hrit: ok
mviri_l1b_fiduceo_nc: ok
mwi_l1b_nc: cannot find module 'satpy.readers.ici_l1b_nc' (No module named 'geotiepoints')
mws_l1b_nc: ok
nucaps: ok
nwcsaf-geo: ok
nwcsaf-msg2013-hdf5: ok
nwcsaf-pps_nc: ok
oceancolorcci_l3_nc: ok
oci_l2_bgc: ok
olci_l1b: ok
olci_l2: ok
oli_tirs_l1_tif: ok
omps_edr: ok
osisaf_nc: ok
safe_sar_l2_ocn: ok
sar-c_safe: cannot find module 'satpy.readers.sar_c_safe' (No module named 'rasterio')
satpy_cf_nc: ok
scatsat1_l2b: cannot find module 'satpy.readers.scatsat1_l2b' (cannot import name 'Dataset' from 'satpy.dataset' (/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/satpy/dataset/__init__.py))
seadas_l2: ok
seviri_l1b_hrit: ok
seviri_l1b_icare: ok
seviri_l1b_native: ok
seviri_l1b_nc: ok
seviri_l2_bufr: cannot find module 'satpy.readers.eum_l2_bufr' (Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes)
seviri_l2_grib: cannot find module 'satpy.readers.eum_l2_grib' (Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes)
sgli_l1b: ok
slstr_l1b: ok
smos_l2_wind: ok
tropomi_l2: ok
vii_l1b_nc: cannot find module 'satpy.readers.vii_l1b_nc' (No module named 'geotiepoints')
vii_l2_nc: cannot find module 'satpy.readers.vii_l2_nc' (No module named 'geotiepoints')
viirs_compact: ok
viirs_edr: ok
viirs_edr_active_fires: ok
viirs_edr_flood: ok
viirs_l1b: ok
viirs_l2: ok
viirs_sdr: ok
viirs_vgac_l1c_nc: ok
virr_l1b: cannot find module 'satpy.readers.virr_l1b' (No module named 'pyspectral')
Writers
awips_tiled: ok
cf: ok
geotiff: cannot find module 'satpy.writers.geotiff' (No module named 'rasterio')
mitiff: ok
ninjogeotiff: cannot find module 'satpy.writers.ninjogeotiff' (No module named 'rasterio')
ninjotiff: cannot find module 'satpy.writers.ninjotiff' (No module named 'pyninjotiff')
simple_image: ok
Versions
platform: macOS-15.2-arm64-arm-64bit
python: 3.12.8
cartopy: 0.24.1
dask: 2024.12.1
fsspec: 2024.12.0
gdal: not installed
geoviews: not installed
h5netcdf: not installed
h5py: 3.12.1
netcdf4: 1.7.2
numpy: 2.2.2
pyhdf: 0.11.6
pyproj: 3.7.0
rasterio: not installed
xarray: 2025.1.1
Additional context
I have played around with the script a bit, and one thing I noticed is that if I remove the line that accesses the values of the data, it seems to resample the data successfully, but it then fails on saving. See the error message for that below.
Traceback (most recent call last):
File "/Users/mjodas/workspace/eea/scripts/meteusat_satpy_example.py", line 36, in <module>
scn_resampled.save_dataset("natural_color", filename='./fci_natural_color_resampled.png')
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/satpy/scene.py", line 1234, in save_dataset
return writer.save_dataset(self[dataset_id],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/satpy/writers/__init__.py", line 885, in save_dataset
return self.save_image(img, filename=filename, compute=compute, fill_value=fill_value, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/satpy/writers/simple_image.py", line 67, in save_image
return img.save(filename, compute=compute, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/trollimage/xrimage.py", line 248, in save
return self.pil_save(filename, fformat, fill_value,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/trollimage/xrimage.py", line 464, in pil_save
return delay.compute()
^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/dask/base.py", line 372, in compute
(result,) = compute(self, traverse=False, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mjodas/workspace/eea/.venv/lib/python3.12/site-packages/dask/base.py", line 660, in compute
results = schedule(dsk, keys, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "src/netCDF4/_netCDF4.pyx", line 5079, in netCDF4._netCDF4.Variable.__getitem__
File "src/netCDF4/_netCDF4.pyx", line 6051, in netCDF4._netCDF4.Variable._get
File "src/netCDF4/_netCDF4.pyx", line 2164, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: HDF error
sys:1: DeprecationWarning: Call to deprecated function (or staticmethod) _destroy.
We've heard of others having similar issues when working with PyPI-installed packages recently. From what I remember, it comes down to an incompatibility between the HDF5/NetCDF4 C libraries and the h5py and netcdf4 Python packages, which install "bundled" versions of those libraries. I see two possible workarounds until those libraries clear things up:
- Use conda to get packages from conda-forge where compatibility is more guaranteed.
- Try pinning versions to netcdf4==1.7.1.post2 and h5py==3.11.0. If that still doesn't work, try creating a new environment and, in addition to these versions, also pin numpy to numpy==1.26.4.
Let us know how it goes.
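To double-check which binary stack actually got installed in a given environment, a quick check like this may help (plain version attributes from the packages themselves, nothing Satpy-specific):
import netCDF4
import h5py
import numpy

# print the Python package versions plus the C libraries they bundle/link against
print("netcdf4:", netCDF4.__version__,
      "| netCDF-C:", netCDF4.getlibversion(),
      "| HDF5:", netCDF4._netCDF4._gethdf5libversion())
print("h5py:", h5py.version.version, "| HDF5:", h5py.version.hdf5_version)
print("numpy:", numpy.__version__)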
Thank you for the fast response! I first tried the version-pinning alternative: I played around with new environments and the specific versions you provided, but I could not get it to work.
Then I went the conda route, and everything worked flawlessly on the first try :)
For me conda is fine, so feel free to close this issue if you want, but I guess others will run into it as well.
Thank you for the help!
Yeah, I think all we can do is wait for a new h5py release that bundles a "better" HDF5. I think the HDF5 Group, or maybe conda-forge's build of HDF5, just made a patch that changes this behavior, which is causing even more confusion. I'll close this for now, but the bottom line for any users running into this in the next month or two is to use conda with conda-forge packages; that should work.
After a lot of fruitless hours attempting to debug this, all I can say is it is still an issue as of today.
I think if this is still an issue with the newest releases, we (the people affected by this) should try to reproduce it with a minimal example that does not use Satpy at all.
I'm getting
sys:1: DeprecationWarning: Call to deprecated function (or staticmethod) _destroy.
when running pytest with xonsh v0.19.4 installed. Uninstalling xonsh fixes the problem. Reinstalling xonsh reproduces it.
Everything is installed from conda-forge.
Do other dependencies change when uninstalling/installing xonsh? And you're talking about the shell program xonsh, right? But... huh?
These are the only dependencies that are added or removed. Yes it's the xonsh shell. It has a pytest plugin. Totally bizarre.
- xonsh 0.19.4 py312h7900ff3_0 conda-forge Cached
- pyperclip 1.9.0 pyha804496_3 conda-forge Cached
- xsel 1.2.1 hb9d3cd8_6 conda-forge Cached
- xclip 0.13 hb9d3cd8_4 conda-forge Cached
- xorg-libxmu 1.2.1 hb9d3cd8_1 conda-forge Cached
- xorg-libxt 1.3.1 hb9d3cd8_0 conda-forge Cached
- xorg-libsm 1.2.6 he73a12e_0 conda-forge Cached
- xorg-libice 1.1.2 hb9d3cd8_0 conda-forge Cached
Despite this being a climate project, I don't actually have satpy installed in my environment. I was searching for the warning, and I recognized satpy, and thought this must be relevant. I'm only posting here because the warning matches, but clearly this doesn't have to do with satpy.
In VS Code / Cursor, the warning manifests as an error and blocks pytest discovery:
2025-05-27 19:09:52.070 [error] Error discovering pytest tests:
n [Error]: sys:1: DeprecationWarning: Call to deprecated function (or staticmethod) _destroy.
I just tried and failed to make a minimal reproducer in a Docker container, and unfortunately don't have any more time to throw at this. My solution is to uninstall xonsh.
I spent a bit of time working on this today - definitely still an issue in 0.57.0/latest when trying to process fci_l1c_nc.
For Debian Bookworm
Breaking Repro:
satpy[geotiff,rayleigh]==0.57.0
netcdf4==1.7.2
Working:
satpy[geotiff,rayleigh]==0.57.0
netcdf4==1.7.1.post2
For Ubuntu 24.04 LTS
Breaking Repro:
satpy[geotiff,rayleigh]==0.57.0
netcdf4==1.7.1.post2
(However, numpy flags that there may be a version mismatch between numpy and netcdf4.)
Working:
satpy[geotiff,rayleigh]==0.57.0
netcdf4==1.6.5
numpy==1.26.4
I reproduced this cleanly in a container with fresh builds. Essentially, whenever the Python package netcdf4 1.7.2 ends up installed, I get a segfault. For instance, installing satpy[geotiff,abi_l1b,viirs_l1b,rayleigh]==0.57.0 will also cause 1.7.2 to be installed.
My container is pretty much Debian bookworm slim, with python3, build-essential and gdal-bin installed. I then install satpy with the above as requirements.txt. Using system Python packages vs a venv didn't have any impact. Unlike other commenters, I couldn't see xonsh being installed.
OP @MachineMoose has netcdf4 1.7.2 installed, which confirms what @djhoese said, but I didn't need to pin anything else (I tried those pins initially and pared back to a minimal viable set). This was a big help but took a while to confirm.
I then got stuck getting things to work back on my dev box (it was happy before I started screwing around) and realised the underlying cause is most likely a mismatch between the system libnetcdf[-dev] or libhdf5 and the Python package netcdf4.
To be clear, this is not a problem with Satpy. We just happen to depend on and use some weird combination of packages. I think the main problem here is netcdf4 + h5py coming from PyPI and likely being built against (and bundling/installing) a different version of HDF5. The other problem package could be hdf5plugin.
If I start with a conda environment:
Remove some dependencies:
conda uninstall --force netcdf4 libnetcdf h5py hdf5 hdf4 hdf5plugin
then install the necessary missing packages:
pip install h5py netcdf4
then run:
from glob import glob
from satpy.scene import Scene
from satpy.utils import debug_on
scn = Scene(reader="fci_l1c_nc", filenames=glob("/data/satellite/fci/RC0072/*.nc"))
scn.load(["vis_04"])
print(scn['vis_04'].values[0, 0])
I get a segmentation fault. If I run the script prefixed with strace python ..., the last thing visible at the low level is a bunch of "futex" calls:
mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f3ca1645000
mprotect(0x7f3ca1646000, 8388608, PROT_READ|PROT_WRITE) = 0
rt_sigprocmask(SIG_BLOCK, ~[], [], 8) = 0
clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7f3ca1e45910, parent_tid=0x7f3ca1e45910, exit_signal=0, stack=0x7f3ca1645000, stack_size=0x7fff00, tls=0x7f3ca1e45640} => {parent_tid=[2559927]}, 88) = 2559927
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0x58da3a62ec00, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x58da3a62ec08, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffd6cb20220, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0
futex(0x58da3a62ec00, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=1183233, tv_nsec=406594632}, FUTEX_BITSET_MATCH_ANY) = ?
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
That doesn't tell us much, but yeah that's where my seg fault is happening. Here are the PyPI installed packages I used:
$ conda list h5
# packages in environment at /home/davidh/miniforge3/envs/fci_debug:
#
# Name Version Build Channel
h5py 3.14.0 pypi_0 pypi
$ conda list net
# packages in environment at /home/davidh/miniforge3/envs/fci_debug:
#
# Name Version Build Channel
netcdf4 1.7.2 pypi_0 pypi
So we need to take this information and reproduce what Satpy is doing, but without Satpy.
You can see that both h5py and netCDF4 (the Python libraries) come bundled with their own separate copies of libhdf5:
$ ls -1 ~/miniforge3/envs/fci_debug/lib/python3.13/site-packages/h5py.libs
libaec-7e9d22b8.so.0.1.3
libhdf5-8c29085d.so.310.5.1
libhdf5_hl-a45b8ce6.so.310.0.6
libsz-b97e7bd9.so.2.0.1
$ ls -1 ~/miniforge3/envs/fci_debug/lib/python3.13/site-packages/netCDF4.libs/
libaec-001fb5f0.so.0.0.12
libcrypto-7323a46d.so.3
libcurl-5a2014d4.so.4.8.0
libhdf5-0b47eb58.so.310.2.0
libhdf5_hl-123198ff.so.310.0.2
libnetcdf-1423d252.so.22
libssl-f6dcfdae.so.3
libsz-b66d1717.so.2.0.1
>>> import h5py
>>> h5py.version.api_version
'1.8'
>>> h5py.version.hdf5_version
'1.14.6'
>>> import netCDF4
>>> netCDF4._netCDF4.__version__
'1.7.2'
>>> netCDF4._netCDF4.getlibversion()
'4.9.4-development of Oct 7 2024 08:34:05 $'
>>> netCDF4._netCDF4._gethdf5libversion()
'1.14.2'
>>>
To prove that it is the bundled HDF5 that's the problem:
$ pip uninstall -y netcdf4 h5py
$ conda install -y libnetcdf hdf5
$ pip install --no-binary netcdf4 --no-binary h5py netcdf4 h5py
$ python debug_fci_netcdf.py
Don't know how to open the following files: {'/data/satellite/fci/RC0072/W_XX-EUMETSAT-Darmstadt,IMG+SAT,MTI1+FCI-1C-RRAD-FDHSI-FD--CHK-TRAIL---NC4E_C_EUMT_20170920120422_GTT_DEV_20170920115008_20170920115922_N__T_0072_0041.nc'}
nan
That is, it works just fine.
I was able to uninstall h5py and still get this segmentation fault.
However, I've tried reproducing this with just netcdf4 (from netCDF4 import Dataset) and with just xarray (with and without dask), having it access one variable's effective_radiance, but I'm unable to reproduce the segmentation fault I get when using Satpy. My guess is it takes some combination of the calibration and attributes/variables being loaded to trigger the problem. Or maybe a combination of xarray and netcdf being used for opening/reading the file (I'm not sure if the FCI reader does this).
I'll try a little longer, but as someone who doesn't use the FCI reader and isn't familiar with the low-level details I think I've hit my limit on the amount of time I can spend on this for the time being. The bottom line is there is something wrong with the bundled HDF5 and NetCDF4 in the netcdf4-python package from PyPI.
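For reference, the xarray attempt looked roughly like this (a sketch, not my exact script; the FCI group path below is an assumption about the file layout):
# sequential multi-file read of one raw variable, no calibration, no Satpy
from glob import glob
import xarray as xr

for path in sorted(glob("/data/satellite/fci/RC0072/*BODY*.nc"))[:5]:
    # group path assumed; open lazily with dask chunks and pull one pixel
    with xr.open_dataset(path, group="data/vis_04/measured", chunks="auto") as ds:
        print(path, ds["effective_radiance"].values[0, 0])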
Whoa, OK, got an interesting result. I tried reproducing without using Satpy's Scene, but was annoyed at how hard it would be to call/initialize the file handler by itself. So I went the higher-level interface way and used load_readers and got this:
from glob import glob

def with_hacked_satpy2():
    from satpy.readers.core.loading import load_readers, load_reader
    readers = load_readers(filenames=sorted(glob("/data/satellite/fci/RC0072/*BODY*.nc"))[:5], reader="fci_l1c_nc")
    r = readers["fci_l1c_nc"]
    datasets = r.load(["vis_04"])
    print(datasets["vis_04"].values[0, 0])
If I load 1 file, it's fine. Sometimes, if I load 2 files it is also fine (use [:2] after the glob); other times it seg faults. If I load 5 files like above, then the first time I ran it I got:
File "/home/davidh/miniforge3/envs/fci_debug/lib/python3.13/site-packages/dask/base.py", line 681, in compute
results = schedule(expr, keys, **kwargs)
File "src/netCDF4/_netCDF4.pyx", line 5079, in netCDF4._netCDF4.Variable.__getitem__
File "src/netCDF4/_netCDF4.pyx", line 6051, in netCDF4._netCDF4.Variable._get
File "src/netCDF4/_netCDF4.pyx", line 2164, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: HDF error
Edit: One of the instances with 5 files I got a successful run, but 99% of the time it is a seg fault.
Edit 2: Tried similar multi-file things of a single variable (no calibration or anything) with netCDF4 and with xarray-dask and couldn't get them to seg fault or error.
Very interesting. That last result (NetCDF: HDF error) was what I would occasionally get, at least when it gracefully errored. I might poke around some more.
What doesn't obviously fit for me is that there doesn't seem to be a direct link between a specific version of a package being installed (e.g. netcdf4 and its shared objects, or h5py) and the segfault behaviour, which is what you'd expect from a straightforward binary incompatibility. You can't just install a particular version and have it reliably break. I have two fully functioning boxes, but both need different dependency versions to work?! And I cannot use one box's set of deps on the other box.
Is there a codepath where system libs somehow come into the loading mix, perhaps by satpy? I don't know enough of the lib to know what's depended on.
I wonder if the noise is two different issues, one with legitimate binary incompatibility and one with something like concurrent libnetcdf file access using the same instance/memory space.
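One way to test the concurrent-access theory without Satpy might be something like this (a sketch that just mimics dask's threaded scheduler; the variable path is an assumed FCI layout, and the data paths are the ones from earlier in this thread):
# read one pixel from several FCI files concurrently from a thread pool
from concurrent.futures import ThreadPoolExecutor
from glob import glob
from netCDF4 import Dataset

def read_pixel(path):
    with Dataset(path) as nc:
        return nc["/data/vis_04/measured/effective_radiance"][0, 0]  # path assumed

paths = sorted(glob("/data/satellite/fci/RC0072/*BODY*.nc"))[:5]
with ThreadPoolExecutor(max_workers=4) as pool:
    print(list(pool.map(read_pixel, paths)))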
From valgrind, this looks like a use-after-free when concurrently reading multiple nc files (assuming that is how Satpy reads them):
valgrind --track-origins=yes <cmd>
==238210== Thread 65:
==238210== Invalid read of size 8
==238210== at 0xB6DCF96F: H5P_get (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C64263: H5CX_get_err_detect (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C796D9: H5D__chunk_lock.constprop.0 (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C7B8C9: H5D__chunk_read (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C97EE5: H5D__read (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6ED3251: H5VL__native_dataset_read (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6EBEAA1: H5VL_dataset_read_direct (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C664AA: H5D__read_api_common.constprop.0 (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C691C1: H5Dread (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB695D8FC: NC4_get_vars (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210== by 0xB695DF10: NC4_get_vara (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210== by 0xB69215FD: NC_get_vara (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210== Address 0x103579148 is 8 bytes after a block of size 48 alloc'd
==238210== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==238210== by 0xB6D084FE: H5FL__malloc (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6D0862C: H5FL_reg_malloc (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6D0871E: H5FL_reg_calloc (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6DD09C0: H5P_copy_plist (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6D215D5: H5G_get_create_plist (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6ED5F5C: H5VL__native_group_get (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6EC4B63: H5VL_group_get (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6D15152: H5Gget_create_plist (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB695798F: rec_read_metadata (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210== by 0xB6957A51: rec_read_metadata (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210== by 0xB695AB4A: NC4_open (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210==
==238210== Invalid read of size 4
==238210== at 0xB6E1B36E: H5SL_search (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6DCF97A: H5P_get (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C64263: H5CX_get_err_detect (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C796D9: H5D__chunk_lock.constprop.0 (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C7B8C9: H5D__chunk_read (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C97EE5: H5D__read (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6ED3251: H5VL__native_dataset_read (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6EBEAA1: H5VL_dataset_read_direct (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C664AA: H5D__read_api_common.constprop.0 (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C691C1: H5Dread (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB695D8FC: NC4_get_vars (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210== by 0xB695DF10: NC4_get_vara (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==238210==
==238210==
==238210== Process terminating with default action of signal 11 (SIGSEGV)
==238210== Access not within mapped region at address 0x0
==238210== at 0xB6E1B36E: H5SL_search (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6DCF97A: H5P_get (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C64263: H5CX_get_err_detect (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C796D9: H5D__chunk_lock.constprop.0 (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C7B8C9: H5D__chunk_read (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C97EE5: H5D__read (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6ED3251: H5VL__native_dataset_read (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6EBEAA1: H5VL_dataset_read_direct (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C664AA: H5D__read_api_common.constprop.0 (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB6C691C1: H5Dread (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libhdf5-0b47eb58.so.310.2.0)
==238210== by 0xB695D8FC: NC4_get_vars (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210== by 0xB695DF10: NC4_get_vara (in /home/tristan/sat/resources/venv/lib/python3.12/site-packages/netCDF4.libs/libnetcdf-1423d252.so.22)
==238210== If you believe this happened as a result of a stack
==238210== overflow in your program's main thread (unlikely but
==238210== possible), you can try to increase the size of the
==238210== main thread stack using the --main-stacksize= flag.
==238210== The main thread stack size used in this run was 8388608.
==238210==
==238210== HEAP SUMMARY:
==238210== in use at exit: 1,332,844,815 bytes in 5,523,813 blocks
==238210== total heap usage: 11,123,988 allocs, 5,600,175 frees, 2,919,069,246 bytes allocated
==238210==
==238210== LEAK SUMMARY:
==238210== definitely lost: 128 bytes in 1 blocks
==238210== indirectly lost: 0 bytes in 0 blocks
==238210== possibly lost: 1,461,162 bytes in 521 blocks
==238210== still reachable: 1,331,383,525 bytes in 5,523,291 blocks
==238210== of which reachable via heuristic:
==238210== stdstring : 694,205 bytes in 19,448 blocks
==238210== length64 : 150,767,442 bytes in 375,990 blocks
==238210== newarray : 3,246,176 bytes in 197,763 blocks
==238210== suppressed: 0 bytes in 0 blocks
==238210== Rerun with --leak-check=full to see details of leaked memory
==238210==
==238210== For lists of detected and suppressed errors, rerun with: -s
==238210== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
zsh: segmentation fault (core dumped)
Is there a codepath where system libs somehow come into the loading mix, perhaps by satpy? I don't know enough of the lib to know what's depended on.
It depends. There is nothing in Satpy specifically that does anything with the C libraries directly. I also think I was wrong and h5py may not be part of this problem, as I've been getting failures with h5py removed. It all depends on how the library was built. If a platform has a corresponding wheel on PyPI, then pip will by default use those binary wheels. In those cases the Python code and its C extensions link to the bundled library. If there is no wheel available for a particular architecture, or you specify --no-binary <package>, then pip will usually build the C extensions of the Python package at install time and try to find whatever C library it can (though this can be customized by the package install scripts, e.g. setup.py). So if you're not in a venv or conda env with libnetcdf installed, then system installs will also be searched.
There are ways to break the above by doing things like defining LD_LIBRARY_PATH at runtime (or LD_RUN_PATH at binary building time), at least on unix-y systems.
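One way to see which libhdf5/libnetcdf a process actually ended up loading (a Linux-only sketch, not something Satpy does for you):
import netCDF4  # importing it loads whatever libhdf5/libnetcdf it links to or bundles

# /proc/self/maps lists every shared object mapped into this process
with open("/proc/self/maps") as maps:
    libs = sorted({line.split()[-1] for line in maps
                   if "libhdf5" in line or "libnetcdf" in line})
print("\n".join(libs))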
Now, Satpy, especially the FCI reader, does do some "fun" things with holding on to a file handle or caching netcdf4-python objects and other such stuff. There are a couple of calls to close the file handle in the utility satpy/readers/core/netcdf.py (in modern Satpy), but even commenting those out didn't change things.
This issue came up again when talking to a colleague (@levidpc). This is definitely a threading issue and can be "worked around" by telling dask to use a single thread. In some of my simple tests I was able to get no seg faults or errors by doing any of the following:
DASK_SCHEDULER="single-threaded" python my_script.py
DASK_SCHEDULER="synchronous" python my_script.py
DASK_NUM_WORKERS="1" python my_script.py
These same settings can be applied inside Python before computing dask arrays with dask.config.set(scheduler="single-threaded"). For the number of workers it is dask.config.set(num_workers=1), I believe (an integer, not a string).
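Applied to the earlier repro script, that looks roughly like this (a sketch):
import dask
from glob import glob
from satpy.scene import Scene

# force dask to compute everything on a single thread (or use num_workers=1)
dask.config.set(scheduler="single-threaded")

scn = Scene(reader="fci_l1c_nc", filenames=glob("/data/satellite/fci/RC0072/*.nc"))
scn.load(["vis_04"])
print(scn["vis_04"].values[0, 0])  # computes serially, no seg fault in my tests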
I was also reminded in a similar issue with the latest conda-forge build of libnetcdf (https://github.com/conda-forge/libnetcdf-feedstock/issues/215) that NetCDF C is not thread safe. However, I was under the impression that xarray tried to lock around some of those issues. Also note that the conda-forge build I'm having issues with there does not fail with my test case, whereas the PyPI version fails.
So a long term solution is still needed and hopefully that doesn't mean waiting for NetCDF C to make a thread-safe implementation.
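For reference, serializing the netCDF4 calls behind a single lock (an illustration only, not a proposed Satpy fix) is essentially what the single-threaded workaround achieves:
import threading
from netCDF4 import Dataset

_NC_LOCK = threading.Lock()

def locked_read(path, var_path):
    # hold one process-wide lock around every open/read so libhdf5 is never
    # entered from two threads at once; var_path is whatever variable you need
    with _NC_LOCK, Dataset(path) as nc:
        return nc[var_path][:]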
Also, for my own future reference: when I tell the Satpy file handler not to cache the file handle (the netcdf4 Dataset instance) and instead reopen it for every variable/attribute access (bad for performance), it also succeeds, even with multiple threaded workers.
I am getting a segmentation fault when trying the MetImage reader on the EUMETSAT sample data (#3277) and ended up here while trying to find the cause.
I'm using conda with the latest packages
h5py 3.15.1 nompi_py314hc32fe06_100 conda-forge
hdf4 4.2.15 h2a13503_7 conda-forge
hdf5 1.14.6 nompi_h6e4c0c1_103 conda-forge
libnetcdf 4.9.3 nompi_h11f7409_103 conda-forge
netcdf4 1.7.3 nompi_py314hed328fd_100 conda-forge
I've tried these, none of which help:
ulimit -n 4096
export DASK_NUM_WORKERS=1
export OMP_NUM_THREADS=1
export DASK_SCHEDULER="single-threaded"
I'm not sure what else to try?
@howff Very odd. I have a feeling that's specific to something about that reader or file format or the libraries being used. Or at least I really hope so, since you're already applying all of the possible workarounds. Or maybe this just reveals that the VII reader does more of the things the FCI reader does that cause the seg faults/errors of the original issue.
Writing to inform that this is still an issue. I managed to get it working, after hours of debugging, with this configuration:
python 3.13.7 (note: python 3.12 was giving the same issue)
netcdf4==1.7.1 (netcdf4==1.7.2 was still a problem)
h5py==3.14.0
The venv was created via uv, so no conda-forge channels or anything, only pypi packages.