netcdf-c
Want an option to ignore variables of unsupported data type when opening zarr files
Our team works with earth science Zarr data in which some variable metadata is stored with a string object type. These variables hold strings, such as source URLs, that correspond to each chunk. We are aware that the documentation says string types are not supported as variables, and that is fine. Because these fields are just additional metadata for internal purposes, they are not necessary for our use of netCDF4. We would like the netCDF4 code to either ignore the string variables, emit a warning that string variables are not supported, or offer limited functionality for them. We just don't want the code to break when we open the Zarr file.
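Until such an option exists, one possible workaround is to scan the store's metadata before opening it and report which variables would be affected. The following is a minimal sketch, not part of netcdf-c or netCDF4. It assumes a Zarr v2 directory store, where each array has a `.zarray` JSON file with a `dtype` entry, and it treats the NumPy kinds `O` (object), `S` (byte string), and `U` (unicode string) as unsupported. The function names and the `source_url` variable used in the usage note are hypothetical.

```python
import json
import warnings
from pathlib import Path


def find_unsupported_arrays(store_dir, unsupported_kinds="OSU"):
    """Return {array_name: dtype} for Zarr v2 arrays whose dtype kind is flagged.

    Assumption: the store is a plain directory and every array has a
    `.zarray` JSON metadata file containing a "dtype" entry.
    """
    root = Path(store_dir)
    flagged = {}
    for meta_path in sorted(root.rglob(".zarray")):
        with open(meta_path) as f:
            meta = json.load(f)
        dtype = meta.get("dtype")
        # Simple dtypes are strings such as "<f8", "|O" or "|S10"; the second
        # character is the NumPy kind. Structured dtypes (lists) are skipped.
        if isinstance(dtype, str) and len(dtype) >= 2 and dtype[1] in unsupported_kinds:
            name = meta_path.parent.relative_to(root).as_posix()
            flagged[name] = dtype
    return flagged


def warn_unsupported_arrays(store_dir):
    """Emit one warning per flagged array and return the flagged mapping."""
    flagged = find_unsupported_arrays(store_dir)
    for name, dtype in flagged.items():
        warnings.warn(f"variable {name!r} has unsupported dtype {dtype!r}")
    return flagged
```

For a store containing a float `time` array and an object-typed `source_url` array, `find_unsupported_arrays` would return only the `source_url` entry, which could then be excluded from a copy of the store before it is opened with netCDF4.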
Out of curiosity, why store these URLs (which I assume have no dimensionality) as variables instead of global attributes?
Any chance you could send me one of those files, either in zip format or as a tar'd directory? Also, I have been working at a low level on adding fixed-size string support to NCZarr. Are you willing to act as a test case for it?
These URLs do have dimensionality. They correspond to each time chunk and contain source information from where the individual file was downloaded.
And yes, @DennisHeimbigner, we would be open to being a test case for string support.
Here is the file that has the issue. It has a bogus URL for now, but it has the same dimensions as the time variable.
In case you are interested in replicating the issue, the error that I get when opening this file is "Assertion failed: (type && type->format_type_info != NULL), function zclose_type, file zclose.c, line 228."
@amberjungminlee Ah, that makes sense then.
This PR (https://github.com/Unidata/netcdf-c/pull/2467) is an experimental draft that attempts to add Zarr/NumPy fixed-size string support to NCZarr.
Fixed by https://github.com/Unidata/netcdf-c/pull/2492