Improve float16 performance
Using HDF5 to read data stored as 16-bit floating point into a 32-bit buffer is extremely slow, around 16x slower than an equivalent conversion in numpy. I uploaded a demo here. For simplicity I used h5py, but one can obtain the same result using the HDF5 C API. HDF5 also seems to discard any payload bits in NaN values. I suspect the slowdown is due to the very general implementation for custom float types in HDF5 here (https://github.com/HDFGroup/hdf5/blob/306db409d44cccbeaff1cd5acb1a99173ac8b185/src/H5Tconv.c#L4267-L4271) versus the float16-specific handling in numpy.
The case I really care about involves a structured data type (for complex values), which is 44x slower than a numpy workaround. That demo is available here, though I haven't isolated a cause for that extra factor of 3x.
It seems like ideally there'd be an H5T__conv_half_single routine that uses hardware to convert from _Float16 (example). I guess this might require adding a native_half type, which seems like a big job. Or maybe just a special case in H5T__conv_f_f?
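To illustrate why a float16-specific path can be so much cheaper than a fully general float-to-float converter, here is a minimal sketch of the bit manipulation involved in widening IEEE 754 binary16 to binary32. It is written in Python for readability only; the function name half_bits_to_single_bits is hypothetical and is not part of HDF5 or numpy, and a real implementation would presumably live in C and use hardware conversion where available. Unlike the behavior described above, this sketch keeps NaN payload bits.

```python
import struct


def half_bits_to_single_bits(h: int) -> int:
    """Widen IEEE 754 binary16 bits (0..0xFFFF) to binary32 bits.

    Illustrative sketch only; the name is hypothetical, not an HDF5 routine.
    """
    sign = (h & 0x8000) << 16      # move sign bit from bit 15 to bit 31
    exp = (h >> 10) & 0x1F         # 5-bit biased exponent
    frac = h & 0x03FF              # 10-bit fraction

    if exp == 0x1F:
        # Inf or NaN: all-ones exponent, fraction (NaN payload) kept as-is.
        return sign | 0x7F800000 | (frac << 13)

    if exp == 0:
        if frac == 0:
            return sign            # signed zero
        # Subnormal half: normalize so the leading 1 becomes implicit.
        shift = 11 - frac.bit_length()
        frac = (frac << shift) & 0x03FF
        exp = 1 - shift

    # Rebias the exponent (127 - 15 = 112) and widen the fraction to 23 bits.
    return sign | ((exp + 112) << 23) | (frac << 13)


def half_bits_to_float(h: int) -> float:
    """Convenience wrapper returning the widened value as a Python float."""
    return struct.unpack("<f", struct.pack("<I", half_bits_to_single_bits(h)))[0]
```

For example, half_bits_to_single_bits(0x3C00) gives 0x3F800000 (the value 1.0), and the NaN pattern 0x7E01 maps to 0x7FC02000 with its payload bit intact.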
I implemented a demo using just the C API, available here: https://github.com/bhawkins/demo_hdf5_c4
Profiling confirms that the slowdown is indeed in H5T__conv_f_f, which is more than 30 times slower than whatever clang does.
As context, this issue is highly relevant to an upcoming NASA mission called NISAR. It is an imaging radar that will soon produce a freely available, global dataset of several petabytes. The data has high dynamic range and high entropy, so float16 encoding is an appealing solution to reduce file sizes.
Software support for float16 varies, and in several scenarios the obvious or default behavior is to use the HDF5 API to convert to float32 on read, which gets bogged down as in the above demos. This is notably the behavior of GDAL, which forms the basis of a wide variety of GIS applications. So while it is possible to work around this problem on an ad hoc basis in each application, there would be a potentially wide-ranging benefit to simply making the libhdf5 code path faster.
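For reference, the two read paths being compared can be sketched in h5py roughly as follows. This is not the linked demo, just a minimal illustration assuming h5py and numpy are installed; the helper names and the dataset name "x" are made up for the example. The first function asks libhdf5 to convert float16 to float32 on read (the slow path), while the second reads the raw float16 values and lets numpy do the widening (the ad hoc workaround).

```python
import h5py
import numpy as np


def make_demo_file(n=1000):
    """Create an in-memory HDF5 file holding a float16 dataset named "x"."""
    # driver="core" with backing_store=False keeps the file off disk.
    f = h5py.File("demo_f16.h5", "w", driver="core", backing_store=False)
    data = np.linspace(-4.0, 4.0, n).astype(np.float16)
    f.create_dataset("x", data=data)
    return f


def read_via_hdf5(dset):
    """Let libhdf5 convert float16 -> float32 while reading."""
    out = np.empty(dset.shape, dtype=np.float32)
    dset.read_direct(out)  # destination dtype differs, so HDF5 converts
    return out


def read_via_numpy(dset):
    """Read the stored float16 values, then widen them in numpy."""
    return dset[...].astype(np.float32)
```

Both functions should return the same values for ordinary finite data, since widening float16 to float32 is exact; the difference discussed in this issue is in how long each path takes on large datasets.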
We plan to add native float16 support, but it probably won't be ready until 1.14.5.
We recently had some interest in this in https://github.com/JuliaIO/HDF5.jl/pull/341#issuecomment-1904214027 . It would be great if there were native float16 and bfloat16 support.
Hi @bhawkins and @mkitti, if you happen to get the chance it would be appreciated if you could look over the RFC for 16-bit float (and complex number) support at https://forum.hdfgroup.org/t/hdf5-rfc-adding-support-for-16-bit-floating-point-and-complex-number-datatypes-to-hdf5/11975 and give any feedback that you may have in that forum thread. Thanks!