xarray
import xarray causes fatal python crash on windows when h5netcdf and netcdf4 are installed
What happened?
On Windows with Python (3.9 and 3.10), the command import xarray
crashes Python if the packages netcdf4 and h5netcdf are both installed.
What did you expect to happen?
I expected that xarray would import normally, without a fatal python error.
Minimal Complete Verifiable Example
# On windows:
pip install xarray
pip install h5netcdf
pip install netcdf4
# This results in a crash
python -c "import xarray"
# The crash does not occur when I first import h5netcdf and then import xarray, so the next line does not result in a crash:
python -c "import h5netcdf;import xarray"
# The crash does not occur on linux.
# The crash does not occur when I have only h5netcdf or netcdf4 installed.
MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
command: python -c "import xarray"
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\h5py\__init__.py:36: UserWarning: h5py is running against HDF5 1.12.1 when it was built against 1.12.2, this may cause problems
_warn(("h5py is running against HDF5 {0} when it was built against {1}, "
Warning! ***HDF5 library version mismatched error***
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
This can happen when an application was compiled by one version of HDF5 but
linked with a different version of static or shared HDF5 library.
You should recompile the application or check your shared library related
settings such as 'LD_LIBRARY_PATH'.
You can, at your own risk, disable this warning by setting the environment
variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
Setting it to 2 or higher will suppress the warning messages totally.
Headers are 1.12.2, library is 1.12.1
SUMMARY OF THE HDF5 CONFIGURATION
=================================
General Information:
-------------------
HDF5 Version: 1.12.1
Configured on: 2022-03-04
Configured by: Ninja
Host system: Windows-10.0.17763
Uname information: Windows
Byte sex: little-endian
Installation point: D:/bld/hdf5_split_1646412547396/_h_env/Library
Compiling Options:
------------------
Build Mode: RELEASE
Debugging Symbols: OFF
Asserts: OFF
Profiling: OFF
Optimization Level: OFF
Linking Options:
----------------
Libraries:
Statically Linked Executables: OFF
LDFLAGS: /machine:x64
H5_LDFLAGS:
AM_LDFLAGS:
Extra libraries: D:/bld/hdf5_split_1646412547396/_h_env/Library/lib/libcurl.lib;D:/bld/hdf5_split_1646412547396/_h_env/Library/lib/libssl.lib;D:/bld/hdf5_split_1646412547396/_h_env/Library/lib/libcrypto.lib
Archiver: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.16.27023/bin/HostX64/x64/lib.exe
Ranlib: :
Languages:
----------
C: YES
C Compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.16.27023/bin/HostX64/x64/cl.exe 19.16.27045.0
CPPFLAGS:
H5_CPPFLAGS:
AM_CPPFLAGS:
CFLAGS: /DWIN32 /D_WINDOWS
H5_CFLAGS: /W3;/wd4100;/wd4706;/wd4127
AM_CFLAGS:
Shared C Library: YES
Static C Library: YES
Fortran: OFF
Fortran Compiler:
Fortran Flags:
H5 Fortran Flags:
AM Fortran Flags:
Shared Fortran Library: YES
Static Fortran Library: YES
C++: ON
C++ Compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.16.27023/bin/HostX64/x64/cl.exe 19.16.27045.0
C++ Flags:
H5 C++ Flags: /W3;/wd4100;/wd4706;/wd4127
AM C++ Flags:
Shared C++ Library: YES
Static C++ Library: YES
JAVA: OFF
JAVA Compiler:
Features:
---------
Parallel HDF5: OFF
Parallel Filtered Dataset Writes:
Large Parallel I/O:
High-level library: ON
Build HDF5 Tests: ON
Build HDF5 Tools: ON
Threadsafety: ON (recursive RW locks: )
Default API mapping: v112
With deprecated public symbols: ON
I/O filters (external): DEFLATE
MPE:
Direct VFD:
Mirror VFD:
(Read-Only) S3 VFD: 1
(Read-Only) HDFS VFD:
dmalloc:
Packages w/ extra debug output:
API Tracing: OFF
Using memory checker: OFF
Memory allocation sanity checks: OFF
Function Stack Tracing: OFF
Use file locking: best-effort
Strict File Format Checks: OFF
Optimization Instrumentation:
Bye...
Error: Process completed with exit code 1.
Anything else we need to know?
This bug is reproduced by a GitHub Actions runner: https://github.com/daanscheltens/test-netcdf4/actions/runs/3196339371/jobs/5218135577
This action is part of a dedicated empty repository that just contains this action workflow: https://github.com/daanscheltens/test-netcdf4/blob/main/.github/workflows/action.yml
Environment
python -c "import h5netcdf; import xarray as xr;xr.show_versions()"
xarray: 2022.9.0
pandas: 1.5.0
numpy: 1.23.3
scipy: None
netCDF4: None
pydap: None
h5netcdf: 1.0.2
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 58.1.0
pip: 22.2.2
conda: None
pytest: None
IPython: None
sphinx: None
I suspect that's because you're installing via pip, where both the h5netcdf and the netcdf4 wheel bundle the HDF5 library (but different versions, apparently). If that's correct, you should get the same error if you're importing netcdf4 then h5netcdf:
import netCDF4
import h5netcdf
Note that because there is a separate libhdf5 package for conda you don't have that issue (and you might want to use micromamba in CI).
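One way to check the suspected version mismatch without triggering the crash is to import each package in its own interpreter process and print the HDF5 version it reports. The following is only a minimal sketch: the helper names probe and bundled_hdf5_versions are made up for illustration, and it assumes the attributes h5py.version.hdf5_version and netCDF4.__hdf5libversion__ are available in the installed releases.

```python
import subprocess
import sys


def probe(snippet):
    """Run a snippet in a fresh interpreter; return its stdout, or None on failure."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def bundled_hdf5_versions():
    # Hypothetical helper: each package is imported in its own process,
    # so two different HDF5 builds are never loaded into one interpreter.
    return {
        "h5py": probe("import h5py; print(h5py.version.hdf5_version)"),
        "netCDF4": probe("import netCDF4; print(netCDF4.__hdf5libversion__)"),
    }


if __name__ == "__main__":
    for package, version in bundled_hdf5_versions().items():
        print(package, version or "not importable")
```

If the two reported versions differ, that would be consistent with the "Headers are 1.12.2, library is 1.12.1" line in the log above.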
Thanks for the quick reply.
I indeed get the same error when trying
import netCDF4
import h5netcdf
Since the h5py library is used by h5netcdf, I also tried the following imports, which again give the same error.
import netCDF4
import h5py
The conclusion is that the HDF5 library bundled with netcdf4 is indeed older, and the check in h5py gives the fatal error.
Both netCDF4 and h5netcdf are optional requirements for xarray. Why is netCDF4 then imported when I don't use it for a certain calculation?
Note that conda is not an option for me, as it is incompatible with necessary third-party software.
See https://github.com/pydata/xarray/issues/6726#issuecomment-1257279640.
We are thinking of doing the backend imports only when needed.
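To illustrate the general idea of importing a backend only when it is needed, here is a rough sketch. It is not xarray's actual implementation; the function names backend_available and load_backend are hypothetical. For a top-level module name, importlib.util.find_spec locates the module without executing it, so an optional dependency can be detected without loading its compiled libraries.

```python
import importlib
import importlib.util


def backend_available(module_name):
    """Check whether a top-level module is installed without importing it."""
    return importlib.util.find_spec(module_name) is not None


def load_backend(module_name):
    """Import a backend module only at the moment it is requested."""
    if not backend_available(module_name):
        raise ImportError(f"backend module {module_name!r} is not installed")
    return importlib.import_module(module_name)
```

With this pattern, a call such as load_backend("netCDF4") would only run when a netCDF4-backed operation is actually requested, so a plain import of the parent package would not load both HDF5 builds.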