
netcdf4 user defined types not handled correctly

Open JohnLCaron opened this issue 4 years ago • 2 comments

See TestNc4EnumWriting as an example.

JohnLCaron, Jul 07 '20 22:07

Problem with enums:

cdl:

netcdf writeEnumFromCdl {
types:
  short enum dessertType_t {dirt = 0, pie = 18, donut = 268, cake = 3284};
dimensions:
  time = UNLIMITED;
variables:
  dessertType_t dessert(time);
}

ncgen -o C:/temp/writeEnumFromCdl.nc4 -k 3 C:/dev/github/netcdf-java/netcdf4/src/test/resources/ucar/nc2/jni/netcdf/enum.cdl

$ ncdump C:/temp/writeEnumFromCdl.nc4
netcdf C:/temp/writeEnumFromCdl {
types:
  short enum dessertType_t {dirt = 0, pie = 18, donut = 268, cake = 3284} ;
dimensions:
  time = UNLIMITED ; // (0 currently)
variables:
  dessertType_t dessert(time) ;
data:
}

But looking at the HDF5 messages, dessert has a self-contained Datatype message:

message type = Datatype(3); datatype= 8 byteSize= 2 NCtype= null flags= 4 0 0 endian= LITTLE enumTypeName= dessert parent base= { datatype= 0 byteSize= 2 NCtype= short flags= 8 0 0 endian= LITTLE}

Note that enumTypeName= dessert, not dessertType_t !!

There is a separate datatype message for the "dessertType_t" "user type":

message type = Datatype(3); datatype= 8 byteSize= 2 NCtype= null flags= 4 0 0 endian= LITTLE enumTypeName= dessertType_t parent base= { datatype= 0 byteSize= 2 NCtype= short flags= 8 0 0 endian= LITTLE}

checking with h5dump:

$ h5dump C:/temp/writeEnumFromCdl.nc4
HDF5 "C:/temp/writeEnumFromCdl.nc4" {
GROUP "/" {
   ATTRIBUTE "_NCProperties" {
      DATATYPE H5T_STRING {
         STRSIZE 34;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      }
      DATASPACE SCALAR
      DATA {
      (0): "version=2,netcdf=4.7.4,hdf5=1.10.6"
      }
   }
   DATASET "dessert" {
      DATATYPE H5T_ENUM {
         H5T_STD_I16LE;
         "dirt" 0;
         "pie" 18;
         "donut" 268;
         "cake" 3284;
      }
      DATASPACE SIMPLE { ( 0 ) / ( H5S_UNLIMITED ) }
      DATA {
      }
      ATTRIBUTE "DIMENSION_LIST" {
         DATATYPE H5T_VLEN { H5T_REFERENCE { H5T_STD_REF_OBJECT }}
         DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
         DATA {
         (0): (DATASET 419 /time )
         }
      }
   }
   DATATYPE "dessertType_t" H5T_ENUM {
      H5T_STD_I16LE;
      "dirt" 0;
      "pie" 18;
      "donut" 268;
      "cake" 3284;
   };
   DATASET "time" {
      DATATYPE H5T_IEEE_F32BE
      DATASPACE SIMPLE { ( 0 ) / ( H5S_UNLIMITED ) }
      DATA {
      }
      ATTRIBUTE "CLASS" {
         DATATYPE H5T_STRING {
            STRSIZE 16;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE SCALAR
         DATA {
         (0): "DIMENSION_SCALE"
         }
      }
      ATTRIBUTE "NAME" {
         DATATYPE H5T_STRING {
            STRSIZE 64;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE SCALAR
         DATA {
         (0): "This is a netCDF dimension but not a netCDF variable. 0"
         }
      }
      ATTRIBUTE "REFERENCE_LIST" {
         DATATYPE H5T_COMPOUND {
            H5T_REFERENCE { H5T_STD_REF_OBJECT } "dataset";
            H5T_STD_I32LE "dimension";
         }
         DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
         DATA {
         (0): {
               DATASET 774 /dessert ,
               0
            }
         }
      }
      ATTRIBUTE "_Netcdf4Dimid" {
         DATATYPE H5T_STD_I32LE
         DATASPACE SCALAR
         DATA {
         (0): 0
         }
      }
   }
}
}

So, the enum variable "dessert" has no internal reference to the user type "dessertType_t", but instead, replicates the enum inside of itself. I tried adding another enum variable, and it also replicated the enum inside of it.

So I'm wondering: do you allow the user type to change, and if so, do all the replicated enums change? Can I assume that there's always a top-level enum that matches the replicated ones in the name/value map? Is this replication an HDF5 thing or a netCDF-4 thing?
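To make the matching question concrete, here is a minimal sketch of how a reader could map a variable's replicated enum back to a named user type, assuming such a matching top-level type always exists. `EnumTypeMatcher`, `EnumDef`, and `findNamedType` are hypothetical names invented for illustration; they are not netcdf-java classes, and this is not necessarily how the library resolves the type.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: resolve an embedded enum to a named user type by content. */
final class EnumTypeMatcher {

    /** Stand-in for an enum datatype message (not a netcdf-java class). */
    static final class EnumDef {
        final String name;                      // e.g. "dessert" or "dessertType_t"
        final int baseSizeBytes;                // byteSize of the base integer type
        final Map<Integer, String> valueToName; // enum value -> member name

        EnumDef(String name, int baseSizeBytes, Map<Integer, String> valueToName) {
            this.name = name;
            this.baseSizeBytes = baseSizeBytes;
            this.valueToName = valueToName;
        }
    }

    /** Returns the first named type whose base size and value/name map equal the embedded copy. */
    static Optional<EnumDef> findNamedType(EnumDef embedded, List<EnumDef> namedTypes) {
        return namedTypes.stream()
                .filter(t -> t.baseSizeBytes == embedded.baseSizeBytes)
                .filter(t -> t.valueToName.equals(embedded.valueToName))
                .findFirst();
    }

    /** The value/name map from the CDL in this issue. */
    static Map<Integer, String> dessertMap() {
        Map<Integer, String> m = new LinkedHashMap<>();
        m.put(0, "dirt");
        m.put(18, "pie");
        m.put(268, "donut");
        m.put(3284, "cake");
        return m;
    }

    public static void main(String[] args) {
        // The variable's own Datatype message carries the variable name, not the type name.
        EnumDef embedded = new EnumDef("dessert", 2, dessertMap());
        EnumDef named = new EnumDef("dessertType_t", 2, dessertMap());

        // Fall back to the embedded name if no named type matches.
        String resolved = findNamedType(embedded, List.of(named))
                .map(t -> t.name)
                .orElse(embedded.name);
        System.out.println(resolved); // prints "dessertType_t"
    }
}
```

Matching by content is ambiguous if two named types share the same base type and members, which is why the questions above matter.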

JohnLCaron, Sep 15 '20 22:09

Fixed problems with reading enums in netcdf4 files (I hope) in PR #479 for version 6. It wouldn't be hard to backport to version 5.

JohnLCaron, Sep 16 '20 14:09