Unable to combine .nc files using ep.combine_echodata()
I am probably doing something stupid, but I have been unable to figure this one out. My goal is to combine several netCDF files into one, but I've been getting an error:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/tkeffer/git/westernflyer/ek80/issue.py", line 10, in <module>
    combined_ed.to_netcdf("test.nc")
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/echopype/echodata/echodata.py", line 612, in to_netcdf
    return to_file(
           ^^^^^^^^
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/echopype/convert/api.py", line 88, in to_file
    _save_groups_to_file(
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/echopype/convert/api.py", line 118, in _save_groups_to_file
    io.save_file(
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/echopype/utils/io.py", line 72, in save_file
    ds.to_netcdf(path=path, mode=mode, group=group, encoding=encoding, **kwargs)
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/xarray/core/dataset.py", line 2102, in to_netcdf
    return to_netcdf( # type: ignore[return-value] # mypy cannot resolve the overloads:(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/xarray/backends/api.py", line 2107, in to_netcdf
    dump_to_store(
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/xarray/backends/api.py", line 2157, in dump_to_store
    store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/xarray/backends/common.py", line 529, in store
    self.set_variables(
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/xarray/backends/common.py", line 567, in set_variables
    target, source = self.prepare_variable(
                     ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/xarray/backends/netCDF4_.py", line 582, in prepare_variable
    encoding = _extract_nc4_variable_encoding(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tkeffer/git/westernflyer/ek80/venv/lib/python3.12/site-packages/xarray/backends/netCDF4_.py", line 311, in _extract_nc4_variable_encoding
    raise ValueError(
ValueError: unexpected encoding parameters for 'netCDF4' backend: ['szip', 'zstd', 'bzip2', 'blosc', 'preferred_chunks']. Valid encodings are: {'fletcher32', 'zlib', '_FillValue', 'szip_coding', 'least_significant_digit', 'chunksizes', 'endian', 'shuffle', 'significant_digits', 'contiguous', 'blosc_shuffle', 'dtype', 'quantize_mode', 'szip_pixels_per_block', 'complevel', 'compression'}
Environment:
- OS: Ubuntu 24.04
- Python: 3.12.5
- echopype: 0.10.1
- pandas: 2.3.2
- numpy: 1.26.4
- xarray: 2025.9.0
- Sonar: EK80
To reproduce
Sample data set
This historical sample data set (about 300 MB) will reproduce the problem, but I have had the same problem with my own data.
wget \
    https://noaa-wcsd-pds.s3.amazonaws.com/data/raw/Bell_M._Shimada/SH2209/EK80/Express-D20220904-T011422.raw \
    https://noaa-wcsd-pds.s3.amazonaws.com/data/raw/Bell_M._Shimada/SH2209/EK80/Express-D20220904-T012014.raw \
    https://noaa-wcsd-pds.s3.amazonaws.com/data/raw/Bell_M._Shimada/SH2209/EK80/Express-D20220904-T012606.raw
Convert
I then converted the .raw files to .nc files with:
import glob
import echopype as ep
raw_files = glob.glob("*.raw")
for raw_file in raw_files:
    ed = ep.open_raw(raw_file, sonar_model="EK80", use_swap=True)
    ed.to_netcdf(raw_file.replace(".raw", ".nc"))
Combine
I then tried to combine the .nc files with:
import glob
import echopype as ep
ed_filenames = sorted(glob.glob("*.nc"))
ed_list = []
for ed_filename in ed_filenames:
    ed_list.append(ep.open_converted(ed_filename))
combined_ed = ep.combine_echodata(ed_list)
combined_ed.to_netcdf("test.nc")
Hi @tkeffer,
Indeed, I can reproduce the error too! I don't use the NetCDF format much with echopype, but let's take it step by step:
- In case it helps in the meantime: if you do not strictly need NetCDF, you could use the Zarr format instead. Combining the files and saving the combined result works with Zarr.
- Coming back to the NetCDF problem, to be sure we’re on the same page:
  - After combined_ed = ep.combine_echodata(ed_list) we can still access and plot the data:
    ds_Sv_nc = ep.calibrate.compute_Sv(combined_ed, waveform_mode="CW", encode_mode="power")
    ds_Sv_nc["Sv"].plot(x="ping_time", row="channel", col_wrap=3, vmin=-80, vmax=-30, cmap="RdYlBu_r", yincrease=False)
- The problem emerges only at the end, when we try to save the combined EchoData object to a .nc file. The error is associated with variable encodings/attributes that the netCDF4 engine doesn’t accept. From what I understand, when EchoData objects are read and combined, some variables end up carrying Zarr-style encoding. These keys are valid for Zarr but invalid for the netCDF4 backend.
- Separately, it seems that combine_echodata() stamps Provenance.attrs["is_combined"] = True. The netCDF4 library doesn’t have a boolean attribute type, so writing that attribute raises an error.
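To make the encoding part concrete: the check that fails amounts to comparing each variable's encoding keys against the set of keys the backend accepts. Here is a minimal sketch in plain Python, assuming the valid set is exactly the one printed in the ValueError message above. The name rejected_encoding_keys and the example encodings are made up for illustration; this is not how xarray implements the check.

```python
# Keys the netCDF4 backend accepts, copied from the ValueError message.
NETCDF4_VALID_ENCODINGS = frozenset({
    "fletcher32", "zlib", "_FillValue", "szip_coding",
    "least_significant_digit", "chunksizes", "endian", "shuffle",
    "significant_digits", "contiguous", "blosc_shuffle", "dtype",
    "quantize_mode", "szip_pixels_per_block", "complevel", "compression",
})


def rejected_encoding_keys(encodings):
    """Map each variable name to the sorted encoding keys that would be rejected.

    `encodings` is a plain mapping {variable_name: encoding_dict}.
    Variables whose encoding keys are all valid are left out of the result.
    """
    report = {}
    for name, enc in encodings.items():
        bad = sorted(k for k in enc if k not in NETCDF4_VALID_ENCODINGS)
        if bad:
            report[name] = bad
    return report


# Hypothetical encodings, for illustration only (not read from a real file).
example = {
    "backscatter_r": {
        "zlib": True,
        "complevel": 4,
        "blosc": False,
        "preferred_chunks": {"ping_time": 100},
    },
    "frequency_nominal": {"dtype": "float64"},
}

print(rejected_encoding_keys(example))
# -> {'backscatter_r': ['blosc', 'preferred_chunks']}
```

On a real dataset ds, the input mapping could be built as {name: var.encoding for name, var in ds.variables.items()}, which would show which variables carry the offending keys.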
I looked a bit but did not find where those encodings/attrs were attached. For now, aside from using the Zarr format, one temporary workaround would be removing these encodings/attributes manually:
import numpy as np
# 1) wipe all per-variable encodings so xarray won't see Zarr-ish keys
for grp in sorted(combined_ed.group_paths):
    path = "/" if grp == "Top-level" else grp
    try:
        ds = combined_ed._tree[path].ds
    except KeyError:
        continue
    for v in ds.variables:
        ds[v].encoding.clear()  # <- removes 'blosc', 'zstd', 'preferred_chunks', etc.
    # also fix any boolean attrs (netCDF4 can't store bool attrs)
    for k, v in list(ds.attrs.items()):
        if isinstance(v, (bool, np.bool_)):
            ds.attrs[k] = int(v)
# 2) write
combined_ed.to_netcdf("./data/combined_nc/combined.nc", overwrite=True)
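Note that encoding.clear() also discards settings the backend does accept, such as _FillValue, dtype, or zlib. If those matter, a narrower variant of step 1 could drop only the rejected keys. The sketch below assumes the valid set is exactly the one listed in the ValueError message. strip_invalid_encoding is a made-up helper name, not part of echopype or xarray, and I have not tested it against the combined EchoData object.

```python
# Keys the netCDF4 backend accepts, copied from the ValueError message.
NETCDF4_VALID_ENCODINGS = frozenset({
    "fletcher32", "zlib", "_FillValue", "szip_coding",
    "least_significant_digit", "chunksizes", "endian", "shuffle",
    "significant_digits", "contiguous", "blosc_shuffle", "dtype",
    "quantize_mode", "szip_pixels_per_block", "complevel", "compression",
})


def strip_invalid_encoding(encoding):
    """Remove rejected keys from `encoding` in place and return what was removed.

    Hypothetical helper: it only edits the dict it is given, so valid keys
    (for example _FillValue or zlib) survive, unlike with encoding.clear().
    """
    # Iterate over a copy of the keys because the dict is mutated in the loop.
    return {
        key: encoding.pop(key)
        for key in list(encoding)
        if key not in NETCDF4_VALID_ENCODINGS
    }


# Made-up encoding dict, for illustration only.
enc = {
    "zlib": True,
    "_FillValue": -1,
    "zstd": False,
    "preferred_chunks": {"ping_time": 100},
}
removed = strip_invalid_encoding(enc)
print(sorted(enc))      # keys that were kept
print(sorted(removed))  # keys that were dropped
```

In the loop above, ds[v].encoding.clear() would then become strip_invalid_encoding(ds[v].encoding). Whether the remaining keys are all still appropriate after combining is something I have not checked.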
This still needs further investigation, as the encodings appear even on a single .nc file, without the combine_echodata() call.
Hope it helps for now!
Note: this also seems to be related to issues #1092, #479, and #975, and to PR #1042.