
netcdf4-python creates fixed length unicode string attributes in netcdf4 files

Open shoyer opened this issue 9 years ago • 1 comments

Over on the h5py issue tracker @krischer has noticed incompatibilities with strings set with setncattr by newer versions of netCDF4-python: https://github.com/h5py/h5py/issues/719#issuecomment-233380039

These strings are saved in the HDF5 file as fixed-width Unicode:

      DATATYPE  H5T_STRING {
         STRSIZE 6;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_UTF8;
         CTYPE H5T_C_S1;
      }
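
To make the dump above concrete, here is a minimal sketch in plain Python of what a fixed-width, null-terminated UTF-8 field implies, assuming `STRSIZE` counts bytes rather than characters. The helpers `pack_fixed_utf8` and `unpack_fixed_utf8` are hypothetical names used only for illustration; they are not part of netCDF4-python, h5py, or HDF5.

```python
def pack_fixed_utf8(text: str, strsize: int) -> bytes:
    """Encode text as UTF-8 and pad it with NUL bytes to exactly strsize bytes.

    Assumption: strsize is a byte count, like STRSIZE in the dump.
    """
    raw = text.encode("utf-8")
    if len(raw) > strsize:
        raise ValueError(
            f"{text!r} needs {len(raw)} bytes but the field holds {strsize}"
        )
    return raw.ljust(strsize, b"\x00")


def unpack_fixed_utf8(buf: bytes) -> str:
    """Decode a fixed-width buffer, stopping at the first NUL byte."""
    return buf.split(b"\x00", 1)[0].decode("utf-8")


# An ASCII value round-trips through a 6-byte field.
field = pack_fixed_utf8("abc", 6)
assert len(field) == 6
assert unpack_fixed_utf8(field) == "abc"

# "Zürich" has 6 characters but 7 UTF-8 bytes, so it does not fit in 6 bytes.
assert len("Zürich") == 6
assert len("Zürich".encode("utf-8")) == 7
```

The point of the sketch is only that a reader has to agree with the writer on both the width and the character set; a reader that treats the same bytes as a plain character array still sees the raw UTF-8 bytes.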

I'm guessing this is related to the recent rewrite of string attribute handling in https://github.com/Unidata/netcdf4-python/issues/529.

This is very strange. netCDF-C reports that these are character arrays (NC_CHAR), but they are clearly saved as UTF-8 in the HDF5 file. Perhaps this is a bug in netCDF-C?

cc @mangecoeur

shoyer, Jul 28 '16 02:07

CC @WardF, because I think this is a bug in netCDF-C. I was able to reproduce this with netCDF-C v4.4.0 installed.

shoyer, Jul 28 '16 15:07