`to_zarr` - AttributeError: 'numpy.dtypes.Float64DType' object has no attribute 'dtype'

Open songhan89 opened this issue 9 months ago • 4 comments

Hi,

I ran into an error when trying to save an xarray dataset to zarr. The error occurs when the dataset is loaded with cubed chunks; if I load the same dataset with dask chunks, the problem does not happen. I would appreciate advice on how to resolve this. Thank you!

Description

import earthkit.data
import xarray as xr
import zarr
import adlfs

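# `files` holds the paths to the sample GRIB files (not defined in this snippet)
earth_ds = earthkit.data.from_source_lazily('file', files)

# Keyword arguments for loading with cubed-backed chunks
args_cubed = {
    'engine': 'cfgrib',
    'filter_by_keys': {
        'dataType': 'fc',
        'typeOfLevel': ['surface', 'isobaricInhPa'],
    },
    'chunked_array_type': 'cubed',
    'chunks': {},
}

# Same arguments without `chunked_array_type`, so the chunks are dask-backed
args_dask = {
    'engine': 'cfgrib',
    'filter_by_keys': {
        'dataType': 'fc',
        'typeOfLevel': ['surface', 'isobaricInhPa'],
    },
    'chunks': {},
}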

Loading data with cubed and dask chunks respectively

import cubed

ds = earth_ds.to_xarray(**args_cubed,
                        from_array_kwargs={'spec': cubed.Spec(allowed_mem='3.5GB')})

ds_dask = earth_ds.to_xarray(**args_dask)

Saving xarray to zarr

# The cubed-backed dataset fails with the AttributeError shown below
ds.to_zarr(store='test.zarr', mode='w', zarr_format=3, compute=True, consolidated=False)

# The dask-backed dataset writes without error
ds_dask.to_zarr(store='test.zarr', mode='w', zarr_format=3, compute=True, consolidated=False)

Error Message

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[8], line 1
----> 1 ds.to_zarr(store='test.zarr', mode='w', zarr_format=3, compute=True, consolidated=False)

File ~/anaconda3/envs/py311/lib/python3.11/site-packages/xarray/core/dataset.py:2622, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, zarr_format, write_empty_chunks, chunkmanager_store_kwargs)
   2454 """Write dataset contents to a zarr group.
   2455
   2456 Zarr chunks are determined in the following way:
   (...)
   2618     The I/O user guide, with more details and examples.
   2619 """
   2620 from xarray.backends.api import to_zarr
-> 2622 return to_zarr(  # type: ignore[call-overload,misc]
   2623     self,
   2624     store=store,
   2625     chunk_store=chunk_store,
   2626     storage_options=storage_options,
   2627     mode=mode,
   2628     synchronizer=synchronizer,
   2629     group=group,
   2630     encoding=encoding,
   2631     compute=compute,
   2632     consolidated=consolidated,
   2633     append_dim=append_dim,
...
File ~/anaconda3/envs/py311/lib/python3.11/site-packages/cubed/array_api/creation_functions.py:68
---> 68     dtype = a.dtype
     70 chunksize = to_chunksize(normalize_chunks(chunks, shape=a.shape, dtype=dtype))
     71 name = gensym()

AttributeError: 'numpy.dtypes.Float64DType' object has no attribute 'dtype'
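
The last frame fails on `a.dtype`, which suggests that a NumPy dtype object, rather than an array, reached that line. The sketch below reproduces only that failure mode with plain NumPy. It is an assumption about the mechanism, not a confirmed root cause in cubed or xarray, and the variable names are illustrative.

```python
import numpy as np

# A NumPy dtype object describes an element type; it is not an array.
dt = np.dtype("float64")

# Arrays expose `.dtype`, so reading it from an array works as expected.
arr = np.zeros((2, 3), dtype=dt)
assert arr.dtype == dt

# The dtype object itself has no `.dtype` attribute, so code that expects
# an array-like and reads `a.dtype` raises when it is handed a dtype instead.
try:
    dt.dtype
    message = None
except AttributeError as exc:
    message = str(exc)

print(message)
```

If this is what happens in the `to_zarr` path, the place to look would be wherever a dtype is passed in a position that the cubed creation function treats as array data.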

References

Sample files

songhan89, Jan 21 '25 12:01