xarray
🐛 NetCDF4 RuntimeWarning if xarray is imported before netCDF4
What happened?
Yesterday we got a dependabot update PR to upgrade xarray from 2022.10.0 to 2022.11.0, and a test where we check for our own deprecation warnings failed because there was an additional unexpected warning.
After some debugging we found that the warning was caused by calling xarray.Dataset.to_netcdf for the first time in our test suite, but it did not trigger when calling it again.
After a lot of head-scratching and confusion, we found that it is an import order problem that can be solved by importing netCDF4 before importing xarray (we didn't import netCDF4 at all in our code).
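The "only on the first call" behaviour fits how Python caches modules: module-level code runs once, on the first import, and later imports reuse the entry in sys.modules. The sketch below simulates this with a throwaway module; `simulated_noisy_backend` and `lazy_user` are hypothetical names used for illustration only, not part of xarray or netCDF4.

```python
import sys
import tempfile
import warnings
from pathlib import Path

# Hypothetical stand-in for an extension module that warns while being imported.
MODULE_SOURCE = (
    "import warnings\n"
    "warnings.warn('size changed (simulated)', RuntimeWarning)\n"
)


def lazy_user():
    """Import the noisy module on first use, like a lazily imported backend."""
    import simulated_noisy_backend  # hypothetical module created below

    return simulated_noisy_backend


def count_warnings(func):
    """Call func and return how many warnings were emitted during the call."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return len(caught)


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "simulated_noisy_backend.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        first_call = count_warnings(lazy_user)   # module body runs, warning fires
        second_call = count_warnings(lazy_user)  # cached in sys.modules, no warning
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("simulated_noisy_backend", None)

print(first_call, second_call)
```

The same mechanism would explain why importing netCDF4 up front moves the warning to a point before the test's warning filters are active.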
What did you expect to happen?
No RuntimeWarning from netCDF4
Minimal Complete Verifiable Example
import xarray
import warnings
warnings.filterwarnings('error')
import netCDF4
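For readers unfamiliar with the 'error' filter used above: it turns any matching warning into a raised exception, which is why the example ends in a traceback instead of a printed warning. A minimal sketch with a simulated warning (the function names are made up for illustration, and the message is a stand-in for the real one):

```python
import warnings


def emit_simulated_warning():
    # Simulated stand-in for the warning that netCDF4 emits on import in this report.
    warnings.warn("numpy.ndarray size changed (simulated)", RuntimeWarning)


def warning_becomes_exception():
    """Return True if the warning is raised as an exception under the 'error' filter."""
    with warnings.catch_warnings():  # restores the global filters afterwards
        warnings.filterwarnings("error")
        try:
            emit_simulated_warning()
        except RuntimeWarning:
            return True
    return False


raised = warning_becomes_exception()
print(raised)
```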
MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
Traceback (most recent call last):
File "d:\git\pyglotaran\glotaran\builtin\io\folder\test\test_folder_plugin.py", line 86, in <module>
save_result(result_path="foo", format_name="folder", result=result)
File "D:\git\pyglotaran\glotaran\plugin_system\io_plugin_utils.py", line 87, in wrapper
return func(*args, **kwargs)
File "D:\git\pyglotaran\glotaran\plugin_system\project_io_registration.py", line 473, in save_result
paths = io.save_result( # type: ignore[call-arg]
File "D:\git\pyglotaran\glotaran\builtin\io\folder\folder_plugin.py", line 192, in save_result
save_dataset(
File "D:\git\pyglotaran\glotaran\plugin_system\io_plugin_utils.py", line 87, in wrapper
return func(*args, **kwargs)
File "D:\git\pyglotaran\glotaran\plugin_system\data_io_registration.py", line 242, in save_dataset
io.save_dataset( # type: ignore[call-arg]
File "D:\git\pyglotaran\glotaran\builtin\io\netCDF\netCDF.py", line 24, in save_dataset
data_to_save.to_netcdf(file_name, mode="w")
File "C:\Anaconda3\envs\pyglotaran310\lib\site-packages\xarray\core\dataset.py", line 1903, in to_netcdf
return to_netcdf( # type: ignore # mypy cannot resolve the overloads:(
File "C:\Anaconda3\envs\pyglotaran310\lib\site-packages\xarray\backends\api.py", line 1176, in to_netcdf
engine = _get_default_engine(path_or_file)
File "C:\Anaconda3\envs\pyglotaran310\lib\site-packages\xarray\backends\api.py", line 140, in _get_default_engine
return _get_default_engine_netcdf()
File "C:\Anaconda3\envs\pyglotaran310\lib\site-packages\xarray\backends\api.py", line 118, in _get_default_engine_netcdf
import netCDF4 # noqa: F401
File "C:\Anaconda3\envs\pyglotaran310\lib\site-packages\netCDF4\__init__.py", line 3, in <module>
from ._netCDF4 import *
File "src\netCDF4\_netCDF4.pyx", line 1, in init netCDF4._netCDF4
RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject
Anything else we need to know?
The problem can be reproduced by running
python -c "import xarray;import warnings;warnings.filterwarnings('error');import netCDF4"
which throws the error
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Anaconda3\envs\xarray\lib\site-packages\netCDF4\__init__.py", line 3, in <module>
from ._netCDF4 import *
File "src\netCDF4\_netCDF4.pyx", line 1, in init netCDF4._netCDF4
RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject
When importing netCDF4 first, everything runs as expected:
python -c "import netCDF4;import xarray;import warnings;warnings.filterwarnings('error');import netCDF4"
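Besides changing the import order, a test suite that turns warnings into errors could ignore just this one message. This is only a sketch of that idea, not a fix endorsed anywhere in this thread: the warning is simulated so the example runs without netCDF4, and the helper names are hypothetical.

```python
import warnings

# Prefix of the message shown in the tracebacks above; matched as a regex at the start.
SIZE_CHANGED_PREFIX = "numpy.ndarray size changed"


def emit_simulated_size_warning():
    # Simulated copy of the RuntimeWarning text from the report.
    warnings.warn(
        "numpy.ndarray size changed, may indicate binary incompatibility. "
        "Expected 16 from C header, got 96 from PyObject",
        RuntimeWarning,
    )


def emit_own_deprecation():
    # Stand-in for a project's own deprecation warning that tests should still catch.
    warnings.warn("old_function is deprecated", DeprecationWarning)


with warnings.catch_warnings():
    warnings.filterwarnings("error")
    # Added last, so it sits in front of the 'error' filter and is checked first.
    warnings.filterwarnings(
        "ignore", message=SIZE_CHANGED_PREFIX, category=RuntimeWarning
    )
    emit_simulated_size_warning()  # ignored, no exception
    size_warning_ignored = True
    try:
        emit_own_deprecation()
    except DeprecationWarning:
        deprecation_still_raises = True
    else:
        deprecation_still_raises = False

print(size_warning_ignored, deprecation_still_raises)
```

The narrow message and category keep every other warning, including the project's own deprecations, subject to the 'error' filter.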
A git bisect shows that the first commit with this problem was https://github.com/pydata/xarray/commit/f32d354e295c05fb5c5ece7862f77f19d82d5894
$ git bisect start
status: waiting for both good and bad commits
(xarray) /d/git/xarray (main|BISECTING)
$ git bisect good v2022.10.0
status: waiting for bad commit, 1 good commit known
(xarray) /d/git/xarray (main|BISECTING)
$ git bisect bad v2022.11.0
Bisecting: 23 revisions left to test after this (roughly 5 steps)
[4944b9eb1483c1fbd0e86fd12f3fb894b325fb8d] Fix binning when labels are provided. (#7205)
(xarray) /d/git/xarray ((4944b9eb...)|BISECTING)
$ git bisect run python -c "import xarray;import warnings;warnings.filterwarnings('error');import netCDF4"
running 'python' '-c' 'import xarray;import warnings;warnings.filterwarnings('\''error'\'');import netCDF4'
Bisecting: 11 revisions left to test after this (roughly 4 steps)
[f32d354e295c05fb5c5ece7862f77f19d82d5894] Lazy Imports (#7179)
running 'python' '-c' 'import xarray;import warnings;warnings.filterwarnings('\''error'\'');import netCDF4'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Anaconda3\envs\xarray\lib\site-packages\netCDF4\__init__.py", line 3, in <module>
from ._netCDF4 import *
File "src\netCDF4\_netCDF4.pyx", line 1, in init netCDF4._netCDF4
RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject
Bisecting: 5 revisions left to test after this (roughly 3 steps)
[b9aedd0155548ed0f34506ecc255b1688f07ffaa] set_coords docs: see also Dataset.assign_coords (#7230)
running 'python' '-c' 'import xarray;import warnings;warnings.filterwarnings('\''error'\'');import netCDF4'
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[65bfa4d10a529f00a9f9b145d1cea402bdae83d0] Actually make the fast code path return early for Aligner.align (#7222)
running 'python' '-c' 'import xarray;import warnings;warnings.filterwarnings('\''error'\'');import netCDF4'
Bisecting: 0 revisions left to test after this (roughly 1 step)
[fc9026b59d38146a21769cc2d3026a12d58af059] Avoid loading any data for reprs (#7203)
running 'python' '-c' 'import xarray;import warnings;warnings.filterwarnings('\''error'\'');import netCDF4'
f32d354e295c05fb5c5ece7862f77f19d82d5894 is the first bad commit
commit f32d354e295c05fb5c5ece7862f77f19d82d5894
Author: Mick <[email protected]>
Date: Fri Oct 28 18:25:39 2022 +0200
Lazy Imports (#7179)
* fix typing of BackendEntrypoint
* make backends lazy
* make matplotlib lazy and add tests for lazy modules
* make flox lazy
* fix generated docs on windows...
* try fixing test
* make pycompat lazy
* make dask.array lazy
* add import xarray without numpy or pandas benchmark
* improve error reporting in test
* fix import benchmark
* add lazy import to whats-new
* fix lazy import test
* fix typos
* fix windows stuff again
asv_bench/benchmarks/import.py | 12 +-
doc/whats-new.rst | 2 +
xarray/backends/cfgrib_.py | 27 ++--
xarray/backends/common.py | 15 ++-
xarray/backends/h5netcdf_.py | 19 ++-
xarray/backends/netCDF4_.py | 16 +--
xarray/backends/pseudonetcdf_.py | 13 +-
xarray/backends/pydap_.py | 24 ++--
xarray/backends/pynio_.py | 13 +-
xarray/backends/scipy_.py | 12 +-
xarray/backends/zarr.py | 15 +--
xarray/convert.py | 3 +-
xarray/core/_aggregations.py | 247 ++++++++++++++++++++++++++++-------
xarray/core/dataset.py | 3 +-
xarray/core/duck_array_ops.py | 31 +++--
xarray/core/formatting.py | 36 ++---
xarray/core/indexing.py | 6 +-
xarray/core/missing.py | 4 +-
xarray/core/parallel.py | 20 +--
xarray/core/pycompat.py | 20 ++-
xarray/core/utils.py | 19 +++
xarray/core/variable.py | 15 +--
xarray/plot/utils.py | 9 +-
xarray/tests/test_backends.py | 4 +-
xarray/tests/test_computation.py | 4 +-
xarray/tests/test_dask.py | 3 +-
xarray/tests/test_dataset.py | 4 +-
xarray/tests/test_duck_array_ops.py | 4 +-
xarray/tests/test_missing.py | 4 +-
xarray/tests/test_plugins.py | 61 ++++++++-
xarray/tests/test_sparse.py | 4 +-
xarray/tests/test_variable.py | 4 +-
xarray/util/generate_aggregations.py | 13 +-
33 files changed, 445 insertions(+), 241 deletions(-)
bisect found first bad commit
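The bisect above works because git bisect run treats exit code 0 as good and most non-zero codes as bad, and python -c exits non-zero when the promoted warning goes uncaught. The sketch below wraps that check in a small helper; `snippet_exit_code` is a hypothetical name, and the stand-in snippets avoid third-party imports so the example runs anywhere.

```python
import subprocess
import sys

# The one-liner from the report; it needs xarray and netCDF4 installed to be meaningful.
REPORT_SNIPPET = (
    "import xarray;import warnings;warnings.filterwarnings('error');import netCDF4"
)


def snippet_exit_code(snippet: str) -> int:
    """Run a snippet in a fresh interpreter and return its exit code."""
    completed = subprocess.run(
        [sys.executable, "-c", snippet], capture_output=True, text=True
    )
    return completed.returncode


# Stand-in snippets that only use the standard library:
clean_code = snippet_exit_code("pass")
failing_code = snippet_exit_code(
    "import warnings; warnings.filterwarnings('error'); "
    "warnings.warn('simulated', RuntimeWarning)"
)

print(clean_code, failing_code)
```

A fresh interpreter per run matters here, since the warning only appears on the first import within a process.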
Environment
INSTALLED VERSIONS
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Oct 7 2022, 20:14:50) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: AMD64 Family 23 Model 8 Stepping 2, AuthenticAMD
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: ('de_DE', 'cp1252')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 2022.11.0
pandas: 1.5.1
numpy: 1.23.4
scipy: 1.9.3
netCDF4: 1.6.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.6.1
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 59.8.0
pip: 22.2.2
conda: None
pytest: 7.1.3
IPython: 8.5.0
sphinx: 5.2.3
None
It is true that the import behavior has changed in the linked PR (netCDF4 is now imported lazily), but I don't see how this can cause such problems.
Normally this warning indicates that something is wrong with your install, but since you can solve it by importing netCDF4 first, I have no idea what is going wrong ...
We were/are also entirely baffled and have no clue what kind of black magic is going on with this 😅
I have already run into this problem several times, and the cause was always a somehow broken numpy install. A reinstall usually fixed this for me. But I assume this is part of a CI pipeline or so?
Yeah, we first saw it on the CI, which moved it from "bad local install" (as pretty much all accepted StackOverflow answers also suggest) to "I have no idea what is going wrong ..." 😅
Is it repeatable locally with the exact same package versions installed?
It is also reproducible locally; only xarray makes the difference. For the git bisect I used a fresh env.
It is also reproducible on binder:
It seems that the binder uses conda-forge, which is why I'm commenting here.
It is really strange in the sense that xarray doesn't compile anything.
https://github.com/conda-forge/xarray-feedstock/blob/main/recipe/meta.yaml#L16
So it must be something that gets lazy loaded that triggers things.
Can we find a minimal environment that reproduces this result? I.e. only numpy, pandas, xarray and netcdf4?
mamba create --name xr numpy pandas xarray netcdf4 --channel conda-forge --override-channels
conda activate xr
python -c "import xarray; import warnings; warnings.filterwarnings('error'); import netCDF4"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/mark/mambaforge/envs/xr/lib/python3.11/site-packages/netCDF4/__init__.py", line 3, in <module>
from ._netCDF4 import *
File "src/netCDF4/_netCDF4.pyx", line 1, in init netCDF4._netCDF4
RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject
mamba list
# packages in environment at /home/mark/mambaforge/envs/xr:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2022.9.24 ha878542_0 conda-forge
cftime 1.6.2 py311h4c7f6c3_1 conda-forge
curl 7.86.0 h2283fc2_1 conda-forge
hdf4 4.2.15 h9772cbc_5 conda-forge
hdf5 1.12.2 nompi_h4df4325_100 conda-forge
icu 70.1 h27087fc_0 conda-forge
jpeg 9e h166bdaf_2 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.19.3 h08a2579_0 conda-forge
ld_impl_linux-64 2.39 hc81fddc_0 conda-forge
libblas 3.9.0 16_linux64_openblas conda-forge
libcblas 3.9.0 16_linux64_openblas conda-forge
libcurl 7.86.0 h2283fc2_1 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 12.2.0 h69a702a_19 conda-forge
libgfortran5 12.2.0 h337968e_19 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
liblapack 3.9.0 16_linux64_openblas conda-forge
libnetcdf 4.8.1 nompi_h261ec11_106 conda-forge
libnghttp2 1.47.0 hff17c54_1 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge
libsqlite 3.40.0 h753d276_0 conda-forge
libssh2 1.10.0 hf14f497_3 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libxml2 2.10.3 h7463322_0 conda-forge
libzip 1.9.2 hc929e4a_1 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
netcdf4 1.6.2 nompi_py311hc6fcf29_100 conda-forge
numpy 1.23.4 py311h7d28db0_1 conda-forge
openssl 3.0.7 h166bdaf_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
pandas 1.5.1 py311h8b32b4d_1 conda-forge
pip 22.3.1 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge
python 3.11.0 ha86cf86_0_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.11 2_cp311 conda-forge
pytz 2022.6 pyhd8ed1ab_0 conda-forge
readline 8.1.2 h0f457ee_0 conda-forge
setuptools 65.5.1 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
tzdata 2022f h191b570_0 conda-forge
wheel 0.38.4 pyhd8ed1ab_0 conda-forge
xarray 2022.11.0 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
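When comparing environments like the one listed above, only a few packages are relevant to this report. The sketch below reads their installed versions from package metadata; `installed_versions` is a hypothetical helper, and the package names are the distribution names used elsewhere in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(["numpy", "netCDF4", "xarray"])
for name, installed in report.items():
    print(f"{name}: {installed or 'not installed'}")
```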
I think it is a numpy thing
mamba create --name np numpy netcdf4 --channel conda-forge --override-channels
conda activate np
python -c "import numpy; import warnings; warnings.filterwarnings('error'); import netCDF4"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/mark/mambaforge/envs/np/lib/python3.11/site-packages/netCDF4/__init__.py", line 3, in <module>
from ._netCDF4 import *
File "src/netCDF4/_netCDF4.pyx", line 1, in init netCDF4._netCDF4
RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject
$ mamba list
# packages in environment at /home/mark/mambaforge/envs/np:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2022.9.24 ha878542_0 conda-forge
cftime 1.6.2 py311h4c7f6c3_1 conda-forge
curl 7.86.0 h2283fc2_1 conda-forge
hdf4 4.2.15 h9772cbc_5 conda-forge
hdf5 1.12.2 nompi_h4df4325_100 conda-forge
icu 70.1 h27087fc_0 conda-forge
jpeg 9e h166bdaf_2 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.19.3 h08a2579_0 conda-forge
ld_impl_linux-64 2.39 hc81fddc_0 conda-forge
libblas 3.9.0 16_linux64_openblas conda-forge
libcblas 3.9.0 16_linux64_openblas conda-forge
libcurl 7.86.0 h2283fc2_1 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 12.2.0 h69a702a_19 conda-forge
libgfortran5 12.2.0 h337968e_19 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
liblapack 3.9.0 16_linux64_openblas conda-forge
libnetcdf 4.8.1 nompi_h261ec11_106 conda-forge
libnghttp2 1.47.0 hff17c54_1 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge
libsqlite 3.40.0 h753d276_0 conda-forge
libssh2 1.10.0 hf14f497_3 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libxml2 2.10.3 h7463322_0 conda-forge
libzip 1.9.2 hc929e4a_1 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
netcdf4 1.6.2 nompi_py311hc6fcf29_100 conda-forge
numpy 1.23.4 py311h7d28db0_1 conda-forge
openssl 3.0.7 h166bdaf_0 conda-forge
pip 22.3.1 pyhd8ed1ab_0 conda-forge
python 3.11.0 ha86cf86_0_cpython conda-forge
python_abi 3.11 2_cp311 conda-forge
readline 8.1.2 h0f457ee_0 conda-forge
setuptools 65.5.1 pyhd8ed1ab_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
tzdata 2022f h191b570_0 conda-forge
wheel 0.38.4 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
So it happens when you import numpy before netCDF4? Then I would claim that it is a netCDF4 problem, haha.
one or the other.
If you are still motivated, you could try figuring out if it happens with older versions of numpy and netcdf as well...
I'm really not sure. It seems to happen with a large swath of versions from my recent search.
Also, when running from the Python REPL, I don't see the warning, which makes me feel like numpy/cython/netcdf4 are trying to suppress the harmless warning.
https://github.com/cython/cython/blob/0.29.x/Cython/Utility/ImportExport.c#L365
I would suggest opening an issue in netcdf4, since it seems to happen when you import numpy before netcdf4.
Weirdly enough, this:
python -c "import warnings; warnings.filterwarnings('error'); import numpy; import netCDF4"
does not error, haha.
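One possible explanation for this order dependence, offered as an assumption rather than something the thread confirms: warnings.filterwarnings inserts each new filter at the front of the list, so whichever filter is registered last wins. If a library registers an "ignore" filter for this message while it is imported (NumPy 1.x is believed to do so for "size changed" messages in its `__init__.py`), then an 'error' filter set before that import is shadowed, while one set after it takes precedence. The sketch simulates both orders with hypothetical stand-in functions.

```python
import warnings

SIMULATED_MESSAGE = "numpy.ndarray size changed (simulated)"


def library_import_side_effect():
    """Hypothetical stand-in for a library registering an ignore filter on import."""
    warnings.filterwarnings("ignore", message="numpy.ndarray size changed")


def user_error_filter():
    """What the one-liners in this thread do: promote all warnings to errors."""
    warnings.filterwarnings("error")


def raises_for(setup_steps):
    """Apply the steps in order, then report whether the warning is raised."""
    with warnings.catch_warnings():
        warnings.resetwarnings()  # start from an empty filter list
        for step in setup_steps:
            step()
        try:
            warnings.warn(SIMULATED_MESSAGE, RuntimeWarning)
        except RuntimeWarning:
            return True
        return False


# Library imported first, 'error' set afterwards: the 'error' filter is in front.
library_then_error = raises_for([library_import_side_effect, user_error_filter])
# 'error' set first, library imported afterwards: the 'ignore' filter is in front.
error_then_library = raises_for([user_error_filter, library_import_side_effect])

print(library_then_error, error_then_library)
```

Under that assumption, the first order corresponds to the failing one-liners (xarray or numpy imported before the 'error' filter) and the second to the command that does not error.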
➕ 1
❯ pip freeze |grep -Ei "xarray|numpy|netcdf"
netCDF4==1.6.3
numpy==1.23.3
xarray==2023.4.2
❯ python -c "import xarray;import warnings;warnings.filterwarnings('error');import netCDF4"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/z3538708/projects/gitlab.com/sqc-eng/se/core/.venv-core/lib/python3.10/site-packages/netCDF4/__init__.py", line 3, in <module>
from ._netCDF4 import *
File "src/netCDF4/_netCDF4.pyx", line 1, in init netCDF4._netCDF4
RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject
❯ python -c "import netCDF4;import xarray;import warnings;warnings.filterwarnings('error');import netCDF4"
❯
No errors when the order is import netCDF4 before import xarray.
Is an MCVE available where the warning shows when xarray is installed, but not when it's not? That would isolate whether this is an xarray issue.
Closing to limit the number of open issues without MCVEs, please feel free to reopen with an MCVE