kerchunk
How to avoid variables with no dimensions from picking up concat_dim?
@martindurant, I've got a collection of NetCDF files I'm combining with kerchunk along the ocean_time dimension.
These files have model constants stored as single-value variables without a dimension.
When I combine them, these variables pick up the ocean_time dimension, but they should not.
How can I avoid this?
Reproducible notebook here: https://nbviewer.org/gist/e0c74a4a7e947e5d04fa3e82147ff146
Is the ocean_time actually the same in all of these?
These variables should not have an ocean_time dimension at all. But they are acquiring one through the concat process.
How should the output look?
The combined dataset should leave single-value constants as constants, with no dimensions, at least in this case, because the constants are the same in every file.
I guess there could be cases where the constants change from file to file and you would want them to have a time dimension, but that's not the case here.
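To make the distinction concrete, here is a rough sketch of how one might check which single-value variables are actually the same in every file before deciding whether they belong in identical_dims. It uses plain dicts as stand-ins for the per-file values, so nothing here is kerchunk API: the helper `split_scalar_variables`, the sample values, and the `ntimes` variable are all hypothetical, and it assumes every file carries the same set of variables.

```python
from typing import Any, Mapping, Sequence


def split_scalar_variables(
    datasets: Sequence[Mapping[str, Any]],
) -> tuple[list[str], list[str]]:
    """Classify scalar variables as constant or varying across inputs.

    Each element of `datasets` is a plain dict mapping a variable name to its
    single value in one file (a hypothetical stand-in for reading real files).
    Assumes every dict has the same keys as the first one.
    """
    if not datasets:
        return [], []
    constant: list[str] = []
    varying: list[str] = []
    for name in sorted(datasets[0]):
        values = [ds[name] for ds in datasets]
        # A variable is "constant" if every file holds the same value.
        if all(value == values[0] for value in values):
            constant.append(name)
        else:
            varying.append(name)
    return constant, varying


# Hypothetical per-file scalar values; `ntimes` is made up for illustration.
files = [
    {"dstart": "2022-07-29T00:00:00", "ntimes": 24},
    {"dstart": "2022-07-29T00:00:00", "ntimes": 48},
]
constant, varying = split_scalar_variables(files)
print(constant)  # ['dstart']
print(varying)   # ['ntimes']
```

Variables that land in `constant` are the candidates for staying dimensionless; anything in `varying` is a case where picking up the time dimension could be the desired behaviour.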
OK, so sounds like "dstart" should be in the "identical_dims"
I added "dstart" to the "identical_dims" but got the same result. Perhaps because "dstart" doesn't have dims?
I can't look into it right now, but ping me next week. So are all of the values of dstart the same? Or is it a plausible dimension for future expansion with more variables (i.e., maybe it should be a concat_coord)?
By adding "dstart" to identical_dims in MultiZarrToZarr, I got the result you were after:
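from kerchunk.combine import MultiZarrToZarr

# json_list and opts are defined elsewhere (not shown in this thread)
mzz = MultiZarrToZarr(
    json_list,
    remote_protocol='s3',
    remote_options=opts,
    target_options=opts,
    concat_dims=['ocean_time'],
    # "dstart" is listed here so it is not concatenated along ocean_time
    identical_dims=['lat_psi', 'lat_rho', 'lat_u', 'lat_v',
                    'lon_psi', 'lon_rho', 'lon_u', 'lon_v', 'dstart'],
)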
->
<xarray.DataArray 'dstart' ()>
array('2022-07-29T00:00:00.000000000', dtype='datetime64[ns]')
Attributes:
    long_name:  time stamp assigned to model initilization
close?