bokeh
[FEATURE] Serialize float16 ndarrays using base64 strings like other float dtypes?
Problem description
I use HTML files to save visualizations containing several images and a control widget.
In an effort to reduce the size of the exported HTML files, I tried to work with np.float16 arrays, as precision is not really a concern here. To my surprise, the resulting file was larger than the file containing float64 data. This is because float64 (and float32) arrays are serialized to binary using base64 strings, while float16 arrays are written out in decimal form, digit by digit (i.e. like their __repr__).
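A rough way to see the size difference described above, using only the standard library. This is a sketch, not Bokeh's serialization code: the helper names (round_to_float16, decimal_size, base64_size) and the sample data are made up for illustration, and it assumes the decimal form is what json.dumps produces for a list of Python floats.

```python
import base64
import json
import struct


def round_to_float16(value):
    """Round a Python float to the nearest float16 value (returned as a Python float)."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def decimal_size(values):
    """Characters needed to write the values as a JSON list of decimal literals."""
    return len(json.dumps(values))


def base64_size(values, fmt):
    """Characters needed for the base64 form of the packed values.

    fmt is a struct format code: "e" = float16, "f" = float32, "d" = float64.
    """
    raw = struct.pack(f"<{len(values)}{fmt}", *values)
    return len(base64.b64encode(raw))


# Illustrative data only: 1000 values in [0, 1), rounded to float16 precision.
values = [round_to_float16(i / 1000) for i in range(1000)]

sizes = {
    "float16 as decimal text": decimal_size(values),
    "float64 as base64": base64_size(values, "d"),
    "float32 as base64": base64_size(values, "f"),
    "float16 as base64": base64_size(values, "e"),
}
for label, size in sizes.items():
    print(f"{label}: {size} characters")
```

Base64 costs 4 characters per 3 bytes, so a base64-encoded float16 array would take about a quarter of the space of the float64 one, while the decimal text can easily exceed both.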
Looking at the relevant code (serialization.py), I can see that float16 arrays are not treated the same way float64/32 arrays are. The list of dtypes supported for binary format serialization is defined here and does not include np.float16.
Maybe there is a reason for this? I can see for instance that np.int64 and np.uint64 dtypes are commented out, so presumably there is a reason to exclude them.
Feature description
If there is no specific reason to exclude np.float16 from the supported dtypes, I imagine it would be straightforward to add it. Binary serialization would then be as compact for float16 as it already is for the other float dtypes.
There is no Float16Array in JS, so if we serialized float16 data as binary, we wouldn't be able to do anything with it on the client. We could convert it from float16 to float32/64 in bokehjs, but that procedure is quite complex and expensive when performed in JS. Perhaps it is something to consider when we start using WASM in bokehjs. WASM doesn't support 16-bit floats either, though there are proposals for that, so we would still have to do some work, but it should be much more reasonable in a language (Rust or C++) and environment better suited to this kind of low-level operation.
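To give a sense of what converting float16 by hand involves, here is a minimal sketch of decoding one IEEE 754 half-precision value from its 16 raw bits. It is written in Python for readability and is not bokehjs code; half_bits_to_float is a hypothetical name. A client without a native 16-bit float type would need to run something like this for every element.

```python
import math
import struct


def half_bits_to_float(bits):
    """Decode a 16-bit IEEE 754 half-precision bit pattern into a Python float."""
    sign = -1.0 if (bits >> 15) & 0x1 else 1.0
    exponent = (bits >> 10) & 0x1F  # 5 exponent bits, bias 15
    fraction = bits & 0x3FF         # 10 fraction bits

    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, fixed scale of 2**-24.
        return sign * fraction * 2.0 ** -24
    if exponent == 0x1F:
        # All exponent bits set: infinity (fraction == 0) or NaN.
        return sign * math.inf if fraction == 0 else math.nan
    # Normal number: implicit leading 1 plus the fraction, scaled by the exponent.
    return sign * (1.0 + fraction / 1024.0) * 2.0 ** (exponent - 15)


# Cross-check against the struct module's built-in half-precision support.
raw = struct.pack("<e", 0.5)
bits = struct.unpack("<H", raw)[0]
print(half_bits_to_float(bits), struct.unpack("<e", raw)[0])
```

The three branches (subnormals, infinities/NaN, normal numbers) are what make a per-element conversion comparatively expensive next to simply viewing a buffer through a typed array.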
There is no Float16Array in JS
Ok thanks, that explains it all. As I hardly know any JS, I did not anticipate this kind of limitation.