zarr-python

Issues converting HDF5/NetCDF file to Zarr

Open drentawc opened this issue 3 years ago • 0 comments

Zarr version

2.13.3

Numcodecs version

0.10.2

Python Version

3.10.4

Operating System

Mac

Installation

Using Conda

Description

I am trying to convert HDF files to Zarr so that I can use them with NetCDF-Java, which currently supports only the Zarr v2 spec. I have previously used xarray for this kind of conversion in Python, but for the NetCDF-Java work I need to use the zarr library directly. I found the example in the Zarr documentation on Read the Docs that converts an HDF5 file to Zarr with the copy_all() method, but every HDF file I have tried fails with an "is not JSON serializable" error. I saw a similar issue, but it is a couple of years old, so I wanted to ask whether there is a more effective way to convert this type of data. Thanks in advance.

Steps to reproduce

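import h5py
import zarr
from sys import stdout

# `path` is the location of the source HDF5/NetCDF file
hdf = h5py.File(path, mode='r')
dest = zarr.open_group('outputs/netcdfjava.zarr', mode='w')

zarr.copy_all(hdf, dest, log=stdout)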
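One possible workaround, which I have not verified against these files, would be to copy the data without attributes and then copy the attributes separately after converting them to JSON-friendly values. The sketch below shows only the conversion step. The helper names `to_json_safe` and `copy_attrs_sanitized` are hypothetical, it assumes byte-string attributes hold UTF-8 text, and it uses stand-in objects so it runs without h5py or zarr. Other attribute types, such as HDF5 object references, would need extra handling.

```python
from types import SimpleNamespace


def to_json_safe(value):
    """Recursively convert an attribute value into something json.dumps accepts."""
    if isinstance(value, bytes):
        # Assumption: byte-string attributes hold UTF-8 text.
        return value.decode("utf-8", errors="replace")
    if hasattr(value, "tolist"):
        # NumPy arrays and scalars expose tolist(); the result may still hold bytes.
        return to_json_safe(value.tolist())
    if isinstance(value, dict):
        return {str(k): to_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(v) for v in value]
    return value


def copy_attrs_sanitized(source_node, dest_node):
    """Copy attributes between two nodes after sanitizing each value."""
    dest_node.attrs.update(
        {key: to_json_safe(val) for key, val in source_node.attrs.items()}
    )


# Stand-in objects (not real h5py/zarr nodes) with made-up attribute names.
src = SimpleNamespace(attrs={"units": b"psu", "flags": [b"a", b"b"]})
dst = SimpleNamespace(attrs={})
copy_attrs_sanitized(src, dst)
print(dst.attrs)  # {'units': 'psu', 'flags': ['a', 'b']}
```

With the real libraries this might be wired up as `zarr.copy_all(hdf, dest, without_attrs=True, log=stdout)`, followed by `copy_attrs_sanitized(hdf, dest)` for the root group and `hdf.visititems(lambda name, node: copy_attrs_sanitized(node, dest[name]))` for everything below it.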

Additional output

<KeysViewHDF5 ['altitude', 'coord_ref', 'l2_lat', 'l2_lon', 'l2_time', 'latitude', 'longitude', 'nv', 'sss', 'sss_dif', 'time', 'time_bnds']>
copy /altitude (1,) float64
Traceback (most recent call last):
  File "/Users/will/Documents/GitHub/ncZarrTest/zarrWriter.py", line 146, in <module>
    main()
  File "/Users/will/Documents/GitHub/ncZarrTest/zarrWriter.py", line 21, in main
    createNcZarr(parser.path)
  File "/Users/will/Documents/GitHub/ncZarrTest/zarrWriter.py", line 52, in createNcZarr
    zarr.copy_all(hdf, dest, log=stdout)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/site-packages/zarr/convenience.py", line 1147, in copy_all
    c, s, b = _copy(
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/site-packages/zarr/convenience.py", line 985, in _copy
    ds.attrs.update(source_attrs)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/site-packages/zarr/attrs.py", line 179, in update
    self._write_op(self._update_nosync, *args, **kwargs)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/site-packages/zarr/attrs.py", line 84, in _write_op
    return f(*args, **kwargs)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/site-packages/zarr/attrs.py", line 193, in _update_nosync
    self._put_nosync(d)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/site-packages/zarr/attrs.py", line 154, in _put_nosync
    self.store[self.key] = json_dumps(d)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/site-packages/zarr/util.py", line 50, in json_dumps
    return json.dumps(o, indent=4, sort_keys=True, ensure_ascii=True,
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/json/encoder.py", line 438, in _iterencode
    o = default(o)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/site-packages/zarr/util.py", line 45, in default
    return json.JSONEncoder.default(self, o)
  File "/Users/will/opt/miniconda3/envs/zarr/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
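The last frames of the traceback suggest that an attribute value of type bytes reaches the standard-library JSON encoder. As a minimal illustration using only the `json` module (the attribute name `units` and its value are made up, since the output above does not show which attribute is involved), the same error can be triggered and avoided like this:

```python
import json

# Hypothetical attribute dict; the real attribute names are not shown in the output.
attrs = {"units": b"psu"}

try:
    json.dumps(attrs)
except TypeError as exc:
    message = str(exc)
    print(message)  # Object of type bytes is not JSON serializable

# Decoding byte strings first (assuming UTF-8 text) lets the encoder succeed.
decoded = {
    key: val.decode("utf-8") if isinstance(val, bytes) else val
    for key, val in attrs.items()
}
encoded = json.dumps(decoded)
print(encoded)  # {"units": "psu"}
```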

drentawc, Oct 17 '22 09:10