
Want an option to ignore variables of unsupported data type when opening zarr files

Open amberjungminlee opened this issue 2 years ago • 6 comments

Our team is working with earth science Zarr data in which some variable metadata is stored as a string object type. These variables contain strings such as the source URLs that correspond to each chunk. We are aware that, according to the documentation, string types are not supported for variables. This is fine: the field is just additional metadata for internal purposes, so it is not necessary for our use of netCDF4. We would like the netCDF4 code to either ignore the string variables, emit a warning that string variables are not supported, or simply offer limited functionality for string variables. We just don't want the code to break when we open the Zarr file.
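
To make the requested "ignore with a warning" behaviour concrete, here is a minimal sketch that only inspects the store's metadata and does not call netCDF at all. It assumes a Zarr v2 directory store, where each array has a `.zarray` JSON file with a `dtype` field. The names `split_variables` and `UNSUPPORTED_KINDS` are hypothetical and exist only for this illustration, and the choice of which dtype kinds count as unsupported is likewise an assumption.

```python
import json
import warnings
from pathlib import Path

# Zarr v2 dtype kind codes treated as "unsupported" here (an assumption for
# illustration): S = byte string, U = unicode string, O = Python object.
UNSUPPORTED_KINDS = {"S", "U", "O"}


def split_variables(store_dir):
    """Return (supported, skipped) array names for a Zarr v2 directory store."""
    supported, skipped = [], []
    root = Path(store_dir)
    for meta_path in sorted(root.rglob(".zarray")):
        meta = json.loads(meta_path.read_text())
        name = meta_path.parent.relative_to(root).as_posix()
        dtype = meta.get("dtype")
        # Simple dtypes look like "<f8" or "|S64"; structured dtypes are
        # lists and are skipped here as well.
        kind = dtype[1] if isinstance(dtype, str) and len(dtype) > 1 else None
        if kind is None or kind in UNSUPPORTED_KINDS:
            # Warn instead of failing, then carry on with the other arrays.
            warnings.warn(f"skipping variable {name!r}: unsupported dtype {dtype!r}")
            skipped.append(name)
        else:
            supported.append(name)
    return supported, skipped
```

Calling `split_variables("path/to/store")` on a store like the one described would be expected to put the numeric arrays in the first list and the string-typed URL variable in the second, with one warning per skipped variable.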

amberjungminlee avatar Jul 27 '22 18:07 amberjungminlee

Out of curiosity, why not store these URLs (which I assume have no dimensionality) as global attributes instead of variables?

dopplershift avatar Jul 27 '22 18:07 dopplershift

Any chance you could send me one of those files, either in zip format or as a tar'd directory? Also, I have been working at a low level on adding fixed-size string support to NCZarr. Are you willing to act as a test case for it?

DennisHeimbigner avatar Jul 27 '22 19:07 DennisHeimbigner

These URLs do have dimensionality. They correspond to each time chunk and contain source information from where the individual file was downloaded.

And yes, @DennisHeimbigner, we would be open to being a test case for string support.

Here is the file that has the issue. It has a bogus URL for now, but it has the same dimensions as the time variable.

In case you are interested in replicating the issue, the error that I get when opening this file is "Assertion failed: (type && type->format_type_info != NULL), function zclose_type, file zclose.c, line 228."

generated.zip

amberjungminlee avatar Jul 28 '22 14:07 amberjungminlee

@amberjungminlee Ah, that makes sense then.

dopplershift avatar Jul 28 '22 17:07 dopplershift

This PR (https://github.com/Unidata/netcdf-c/pull/2467) is an experimental draft that attempts to add Zarr/NumPy fixed-size string support to NCZarr.

DennisHeimbigner avatar Aug 01 '22 20:08 DennisHeimbigner

Fixed by https://github.com/Unidata/netcdf-c/pull/2492

DennisHeimbigner avatar Aug 28 '22 04:08 DennisHeimbigner