Engine parameter ignored when writing NetCDF4 if writing to a file handle.
What happened?
I tried to call ds.to_netcdf(f, engine="netcdf4", format="NETCDF4") with f an open file handle, and got a
ValueError: invalid format for scipy.io.netcdf backend: 'NETCDF4'
What did you expect to happen?
I expected the write to succeed.
Minimal Complete Verifiable Example
import numpy as np
import xarray as xr

arr_thr = np.random.random_sample((2, 100, 100))
x_var = np.arange(100)
y_var = np.arange(100)

ds = xr.Dataset(
    {
        "low_threshold": (["x", "y"], arr_thr[0], {"units": "dB"}),
        "high_threshold": (["x", "y"], arr_thr[1], {"units": "dB"}),
    },
    coords={
        "x_dist": (["x"], x_var, {"units": "m"}),
        "y_dist": (["y"], y_var, {"units": "m"}),
    },
)

with open("/tmp/test.nc", "wb") as f:
    ds.to_netcdf(f, engine="netcdf4", format="NETCDF4")
MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In [9], line 2
1 with open("/tmp/test.nc", "wb") as f:
----> 2 ds.to_netcdf(f, engine="netcdf4", format="NETCDF4")
File ~/mambaforge/envs/eiland-dev/lib/python3.10/site-packages/xarray/core/dataset.py:1882, in Dataset.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
1879 encoding = {}
1880 from ..backends.api import to_netcdf
-> 1882 return to_netcdf( # type: ignore # mypy cannot resolve the overloads:(
1883 self,
1884 path,
1885 mode=mode,
1886 format=format,
1887 group=group,
1888 engine=engine,
1889 encoding=encoding,
1890 unlimited_dims=unlimited_dims,
1891 compute=compute,
1892 multifile=False,
1893 invalid_netcdf=invalid_netcdf,
1894 )
File ~/mambaforge/envs/eiland-dev/lib/python3.10/site-packages/xarray/backends/api.py:1193, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
1189 else:
...
--> 138 raise ValueError(f"invalid format for scipy.io.netcdf backend: {format!r}")
140 if lock is None and mode != "r" and isinstance(filename_or_obj, str):
141 lock = get_write_lock(filename_or_obj)
ValueError: invalid format for scipy.io.netcdf backend: 'NETCDF4'
Anything else we need to know?
No response
Environment
xarray: 2022.6.0
pandas: 1.4.4
numpy: 1.23.2
scipy: 1.9.1
netCDF4: 1.6.0
pydap: None
h5netcdf: 1.0.2
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.8.2
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.3.0
pip: 22.2.2
conda: None
pytest: None
IPython: 8.4.0
sphinx: None
I have updated the bug with more information and an MCVE.
Is it possible to write to an open file object with netCDF4?
We are currently overwriting engine here, but we should raise an error if engine != 'scipy'. We should only set scipy if engine is None:
https://github.com/pydata/xarray/blob/58ab594aa4315e75281569902e29c8c69834151f/xarray/backends/api.py#L1174-L1175
This would be a relatively simple pull request if you are up for it!
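For anyone picking this up, the check suggested above could look roughly like the sketch below. The helper name resolve_engine_for_file_object is hypothetical and is not part of xarray; the sketch assumes, as the comment above implies, that scipy is the only engine able to write to a file object in this version.

```python
from typing import Optional


def resolve_engine_for_file_object(engine: Optional[str]) -> str:
    """Hypothetical helper: choose the engine when the target is a file object.

    Assumption: only the scipy backend can write to file objects.
    """
    if engine is None:
        # No explicit choice from the caller: default to scipy.
        return "scipy"
    if engine != "scipy":
        # An explicit, unsupported choice should fail loudly instead of
        # being silently replaced by scipy.
        raise ValueError(
            f"cannot write to a file object with engine={engine!r}; "
            "only engine='scipy' is supported for file objects"
        )
    return engine
```

With a check like this, the call in the report would fail with a message that names the engine, rather than the misleading scipy format error.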
I get the same issue with h5netcdf, so the issue isn't unique to netCDF4.
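Until this is resolved, one possible workaround is to write to a temporary path with the desired engine and then copy the bytes into the open file handle. This is only a sketch: write_via_temp_path is a hypothetical helper, not an xarray function, and it assumes the writer closes the file before returning, which to_netcdf does when given a path.

```python
import os
import shutil
import tempfile
from typing import BinaryIO, Callable


def write_via_temp_path(write_to_path: Callable[[str], None], fileobj: BinaryIO) -> None:
    """Hypothetical helper: let a path-only writer target an open binary handle."""
    with tempfile.TemporaryDirectory() as tmpdir:
        tmp_path = os.path.join(tmpdir, "out.nc")
        # The writer receives a real filesystem path, so any engine can be used.
        write_to_path(tmp_path)
        # Copy the finished file into the caller's handle.
        with open(tmp_path, "rb") as src:
            shutil.copyfileobj(src, fileobj)


# Intended usage with the dataset from the example above
# (requires xarray and netCDF4 to be installed):
#
# with open("/tmp/test.nc", "wb") as f:
#     write_via_temp_path(
#         lambda p: ds.to_netcdf(p, engine="netcdf4", format="NETCDF4"), f
#     )
```

The cost is an extra copy through disk, so this is mainly useful when the file handle cannot be replaced by a path.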