Change `compression` to `compressor` in `netCDF3.translate` `zarr.create_dataset` calls
Replaces `compression` with the correct `compressor` argument in the `zarr.create_dataset` calls in `netCDF3.translate`.
Fixes #534
Tested with Python 3.12 and kerchunk v0.2.7 on RHEL 8.
This is right, but I don't understand why no one hit it before! Is it possible to add a test in https://github.com/fsspec/kerchunk/blob/main/tests/test_netcdf.py which would have failed before, but passes with this change?
@martindurant I'll see if I can update the netcdf unit tests to capture this behavior.
@martindurant It looks like test_netcdf.test_unlimited should catch the error if the zarr version is 3.0.0 or later. If I run it independently with Zarr v3 (I added a print statement to show the zarr version), I get the following output:
```
/net/Jessica.Liptak/miniconda3/envs/_MDTF_dev/bin/python3.12 /net/jml/pycharm-2024.3/plugins/python-ce/helpers/pycharm/_jb_pytest_runner.py --target test_netcdf.py::test_unlimited
Testing started at 10:38 AM ...
Launching pytest with arguments test_netcdf.py::test_unlimited --no-header --no-summary -q in /net/jml/kerchunk/tests

============================= test session starts ==============================
collecting ... collected 1 item
test_netcdf.py::test_unlimited
======================== 1 failed, 2 warnings in 1.12s =========================
FAILED [100%]
Running with Zarr 3.0.0

tests/test_netcdf.py:79 (test_unlimited)
unlimited_dataset = '/tmp/pytest-of-Jessica.Liptak/pytest-3/test_unlimited0/test.nc'

    def test_unlimited(unlimited_dataset):
        fn = unlimited_dataset
        expected = xr.open_dataset(fn, engine="scipy")
        h = netCDF3.NetCDF3ToZarr(fn)
>       out = h.translate()

test_netcdf.py:84:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../kerchunk/netCDF3.py:194: in translate
    arr = z.create_dataset(
/net/Jessica.Liptak/miniconda3/envs/_MDTF_dev/lib/python3.12/site-packages/typing_extensions.py:2853: in wrapper
    return arg(*args, **kwargs)
/net/Jessica.Liptak/miniconda3/envs/_MDTF_dev/lib/python3.12/site-packages/zarr/core/group.py:2395: in create_dataset
    return Array(self._sync(self._async_group.create_dataset(name, **kwargs)))
/net/Jessica.Liptak/miniconda3/envs/_MDTF_dev/lib/python3.12/site-packages/zarr/core/sync.py:187: in _sync
    return sync(
/net/Jessica.Liptak/miniconda3/envs/_MDTF_dev/lib/python3.12/site-packages/zarr/core/sync.py:142: in sync
    raise return_result
/net/Jessica.Liptak/miniconda3/envs/_MDTF_dev/lib/python3.12/site-packages/zarr/core/sync.py:98: in _runner
    return await coro
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <AsyncGroup memory://22386488263744>, name = 'lat', shape = (10,)
kwargs = {'chunks': (10,), 'compression': None, 'dtype': dtype('>f4'), 'fill_value': None}
data = None

    @deprecated("Use AsyncGroup.create_array instead.")
    async def create_dataset(
        self, name: str, *, shape: ShapeLike, **kwargs: Any
    ) -> AsyncArray[ArrayV2Metadata] | AsyncArray[ArrayV3Metadata]:
        """Create an array.

        .. deprecated:: 3.0.0
            The h5py compatibility methods will be removed in 3.1.0. Use `AsyncGroup.create_array` instead.

        Arrays are known as "datasets" in HDF5 terminology. For compatibility
        with h5py, Zarr groups also implement the :func:`zarr.AsyncGroup.require_dataset` method.

        Parameters
        ----------
        name : str
            Array name.
        **kwargs : dict
            Additional arguments passed to :func:`zarr.AsyncGroup.create_array`.

        Returns
        -------
        a : AsyncArray
        """
        data = kwargs.pop("data", None)
        # create_dataset in zarr 2.x requires shape but not dtype if data is
        # provided. Allow this configuration by inferring dtype from data if
        # necessary and passing it to create_array
        if "dtype" not in kwargs and data is not None:
            kwargs["dtype"] = data.dtype
>       array = await self.create_array(name, shape=shape, **kwargs)
E       TypeError: AsyncGroup.create_array() got an unexpected keyword argument 'compression'

/net/Jessica.Liptak/miniconda3/envs/_MDTF_dev/lib/python3.12/site-packages/zarr/core/group.py:1169: TypeError

Process finished with exit code 1
```
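The traceback boils down to a keyword-renaming issue: zarr 3's deprecated `create_dataset` shim forwards leftover kwargs unchanged to `create_array`, which accepts a `compressor` argument but not the zarr 2-era `compression`. A minimal, self-contained sketch of the mechanism (these stand-in functions are simplified illustrations for this discussion, not the real zarr API):

```python
# Simplified stand-ins mimicking the keyword handling seen in the traceback.
# NOT the real zarr code: create_array here only accepts `compressor`, so a
# forwarded `compression` kwarg fails with the same TypeError as above.

def create_array(name, *, shape, chunks=None, dtype=None,
                 fill_value=None, compressor=None):
    # zarr 3 names the codec argument `compressor`
    return {"name": name, "shape": shape, "compressor": compressor}

def create_dataset(name, *, shape, **kwargs):
    # deprecated h5py-style shim: forwards unknown kwargs unchanged
    return create_array(name, shape=shape, **kwargs)

# Passing the zarr 2-era `compression` kwarg fails like the log above:
try:
    create_dataset("lat", shape=(10,), chunks=(10,), compression=None)
except TypeError as err:
    print(err)  # a TypeError is raised, mirroring the failure above

# Renaming the kwarg, as this PR does in netCDF3.translate, succeeds:
arr = create_dataset("lat", shape=(10,), chunks=(10,), compressor=None)
```

This is why the bug only surfaces with zarr >= 3.0.0: zarr 2's `create_dataset` tolerated `compression`, while zarr 3's shim forwards it verbatim to a stricter signature.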
You'll see that I am using the kerchunk conda package. I have not pinned the zarr version in this test environment, so Zarr 3.0.0 was installed by default.
I think this should be fixed in the latest release, which now only supports zarr 3.