Engine parameter ignored when writing NetCDF4 if writing to a file handle.

Open hameer-spire opened this issue 3 years ago • 5 comments

What happened?

I called ds.to_netcdf(f, engine="netcdf4", format="NETCDF4"), where f is an open file handle, and got:

ValueError: invalid format for scipy.io.netcdf backend: 'NETCDF4'

What did you expect to happen?

I expected the write to succeed.

Minimal Complete Verifiable Example

import numpy as np
import xarray as xr

arr_thr = np.random.random_sample((2, 100, 100))

x_var = np.arange(100)
y_var = np.arange(100)

ds = xr.Dataset(
    {
        "low_threshold": (["x", "y"], arr_thr[0], {"units": "dB"}),
        "high_threshold": (["x", "y"], arr_thr[1], {"units": "dB"}),
    },
    coords={
        "x_dist": (["x"], x_var, {"units": "m"}),
        "y_dist": (["y"], y_var, {"units": "m"}),
    },
)

with open("/tmp/test.nc", "wb") as f:
    ds.to_netcdf(f, engine="netcdf4", format="NETCDF4")

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [9], line 2
      1 with open("/tmp/test.nc", "wb") as f:
----> 2     ds.to_netcdf(f, engine="netcdf4", format="NETCDF4")

File ~/mambaforge/envs/eiland-dev/lib/python3.10/site-packages/xarray/core/dataset.py:1882, in Dataset.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
   1879     encoding = {}
   1880 from ..backends.api import to_netcdf
-> 1882 return to_netcdf(  # type: ignore  # mypy cannot resolve the overloads:(
   1883     self,
   1884     path,
   1885     mode=mode,
   1886     format=format,
   1887     group=group,
   1888     engine=engine,
   1889     encoding=encoding,
   1890     unlimited_dims=unlimited_dims,
   1891     compute=compute,
   1892     multifile=False,
   1893     invalid_netcdf=invalid_netcdf,
   1894 )

File ~/mambaforge/envs/eiland-dev/lib/python3.10/site-packages/xarray/backends/api.py:1193, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
   1189     else:
...
--> 138     raise ValueError(f"invalid format for scipy.io.netcdf backend: {format!r}")
    140 if lock is None and mode != "r" and isinstance(filename_or_obj, str):
    141     lock = get_write_lock(filename_or_obj)

ValueError: invalid format for scipy.io.netcdf backend: 'NETCDF4'

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-1019-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1

xarray: 2022.6.0
pandas: 1.4.4
numpy: 1.23.2
scipy: 1.9.1
netCDF4: 1.6.0
pydap: None
h5netcdf: 1.0.2
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.8.2
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.3.0
pip: 22.2.2
conda: None
pytest: None
IPython: 8.4.0
sphinx: None

hameer-spire, Sep 06 '22 14:09

I have updated the bug with more information and an MCVE.

hameer-spire, Oct 03 '22 13:10

Is it possible to write to an open file object with netCDF4?

We are currently overwriting engine here, but we should raise an error if engine != 'scipy'. We should only set engine to 'scipy' if engine is None.

https://github.com/pydata/xarray/blob/58ab594aa4315e75281569902e29c8c69834151f/xarray/backends/api.py#L1174-L1175

This would be a relatively simple pull request if you are up for it!

dcherian, Oct 03 '22 20:10
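
For readers following along, the rule suggested above (default to scipy only when no engine is given, and raise when a different engine is requested for a file object) could look roughly like the sketch below. This is only an illustration: `resolve_engine` is a hypothetical helper name, not a function in xarray, and the error message is made up for the example.

```python
import io
import os
from typing import Optional


def resolve_engine(path_or_file, engine: Optional[str]) -> Optional[str]:
    """Hypothetical helper sketching the proposed check; not xarray code."""
    # Paths keep whatever engine the caller asked for. Picking a default
    # engine for paths is out of scope for this sketch.
    if isinstance(path_or_file, (str, os.PathLike)):
        return engine

    # From here on we are dealing with a file-like object.
    if engine is None:
        # Only fall back to scipy when the caller did not choose an engine.
        return "scipy"
    if engine != "scipy":
        # Fail early instead of silently swapping the engine.
        raise ValueError(
            f"cannot write to a file-like object with engine={engine!r}; "
            "only engine='scipy' is accepted here"
        )
    return engine


# Usage: an explicit non-scipy engine now fails with a clear message.
try:
    resolve_engine(io.BytesIO(), "netcdf4")
except ValueError as err:
    print(err)
```

With a check like this, the call in the report would fail up front with a message about the engine, rather than later with a confusing message about an invalid format for the scipy backend.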

I get the same issue with h5netcdf, so the issue isn't unique to netCDF4.

hameer-spire, Oct 04 '22 06:10
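
In the meantime, one possible workaround is to avoid the file-handle code path altogether: write to a temporary path with the desired engine, then copy the bytes into the open handle. The sketch below assumes a local temporary directory is acceptable for your data size. `write_netcdf4_to_handle` is a hypothetical helper name, not part of xarray; it only relies on `Dataset.to_netcdf` accepting a path together with `engine` and `format`.

```python
import os
import shutil
import tempfile


def write_netcdf4_to_handle(ds, fileobj, engine="netcdf4", fmt="NETCDF4"):
    """Hypothetical workaround: write via a temporary path, then copy.

    `ds` is expected to be an xarray.Dataset (anything with a compatible
    `to_netcdf(path, engine=..., format=...)` method works).
    `fileobj` must be opened for binary writing.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        tmp_path = os.path.join(tmpdir, "out.nc")
        # Writing to a path lets the requested engine be used.
        ds.to_netcdf(tmp_path, engine=engine, format=fmt)
        # Stream the finished file into the caller's handle.
        with open(tmp_path, "rb") as src:
            shutil.copyfileobj(src, fileobj)
```

With the dataset from the example in the report, this would be called as `write_netcdf4_to_handle(ds, f)` inside the `with open("/tmp/test.nc", "wb") as f:` block. The cost is one extra copy of the file on disk, so it may not suit very large datasets.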