cf-python hdf5 chunk defaults and handling

Open bnlawrence opened this issue 1 year ago • 0 comments

The current behaviour when reading a chunked file is somewhat surprising (to me). If one reads this variable:

float UM_m01s02i205_vn1106(time, latitude, longitude) ;
		# skip uninteresting attributes for this issue
		UM_m01s02i205_vn1106:_Storage = "chunked" ;
		UM_m01s02i205_vn1106:_ChunkSizes = 1, 1920, 2560 ;

I see the following unexpected result:

In [30]: g = cf.read('double-chunking-testc.nc')[0]
In [31]: g.data.nc_hdf5_chunksizes()
Out[31]: ()

This is not a bug, insofar as it is the expected behaviour of the code: by construction, cf-python currently doesn't remember the HDF5 chunk sizes from the read.

Should it? If so, it could be done, possibly with caveats about when that is a sensible thing to do, and it may well need to forget them when certain operations are applied (e.g. when aggregating files with different HDF5 chunk sizes, when subspacing, when adding/removing/transposing dimensions, etc.).
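
To make the "remember, but forget when unsafe" idea concrete, here is a minimal sketch of one possible policy in plain Python. It is not cf-python code: `ArrayInfo`, `subspace`, `transpose` and `aggregate` are hypothetical names invented for illustration, the time length of 12 is made up, and the rule used (keep the chunk sizes only when the operation leaves them obviously valid, otherwise drop them) is just one assumption about what "sensible" might mean.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class ArrayInfo:
    """Hypothetical record: an array shape plus the HDF5 chunk sizes
    remembered from the read (None means nothing is remembered)."""

    shape: Tuple[int, ...]
    chunksizes: Optional[Tuple[int, ...]] = None


def subspace(info: ArrayInfo, new_shape: Sequence[int]) -> ArrayInfo:
    """Keep the chunk sizes only if the subspace is the whole array."""
    new_shape = tuple(new_shape)
    keep = info.chunksizes if new_shape == info.shape else None
    return ArrayInfo(new_shape, keep)


def transpose(info: ArrayInfo, axes: Sequence[int]) -> ArrayInfo:
    """Permute the shape; forget the chunk sizes unless the order is unchanged."""
    axes = tuple(axes)
    new_shape = tuple(info.shape[i] for i in axes)
    unchanged = axes == tuple(range(len(info.shape)))
    return ArrayInfo(new_shape, info.chunksizes if unchanged else None)


def aggregate(infos: Sequence[ArrayInfo], axis: int) -> ArrayInfo:
    """Concatenate along `axis`; keep the chunk sizes only if every input
    remembers exactly the same ones."""
    first = infos[0]
    for other in infos[1:]:
        if len(other.shape) != len(first.shape):
            raise ValueError("inputs must have the same number of dimensions")
        for i, (m, n) in enumerate(zip(first.shape, other.shape)):
            if i != axis and m != n:
                raise ValueError("inputs must match on non-aggregated axes")
    new_shape = list(first.shape)
    new_shape[axis] = sum(info.shape[axis] for info in infos)
    same = all(info.chunksizes == first.chunksizes for info in infos)
    return ArrayInfo(tuple(new_shape), first.chunksizes if same else None)


# Shapes modelled on the variable above; the time length (12) is made up.
a = ArrayInfo((12, 1920, 2560), (1, 1920, 2560))
b = ArrayInfo((12, 1920, 2560), (1, 1920, 2560))
c = ArrayInfo((12, 1920, 2560), (12, 960, 1280))

print(aggregate([a, b], axis=0))  # same chunking on both inputs
print(aggregate([a, c], axis=0))  # different chunking
print(subspace(a, (1, 10, 10)))   # a genuine subspace
```

A real implementation could be less conservative, for example by permuting the chunk sizes on transpose or clipping them to the new shape on subspace; whether that is wanted is exactly the open question here.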

Another V4.0 issue!

Jun 05 '24 08:06 bnlawrence
