HDF5.jl

Better storage of vectors of short strings

Open · simonster opened this issue 11 years ago · 1 comment

It would be more efficient to store large arrays of very short strings using a fixed-length string type, since each variable-length string carries many bytes of overhead.
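To put rough numbers on this, here is a back-of-the-envelope sketch in plain Python (no HDF5 calls). The function names are made up for illustration, and the 16 bytes of per-string overhead is an assumed figure, not something measured against the library; the real cost depends on the HDF5 version and file layout.

```python
def fixed_length_bytes(strings):
    """Bytes needed if every string is padded to the longest one."""
    encoded = [s.encode("utf-8") for s in strings]
    width = max((len(b) for b in encoded), default=0)
    return width * len(encoded)


def variable_length_bytes(strings, overhead_per_string=16):
    """Bytes needed if each string is stored separately.

    overhead_per_string is an ASSUMED per-element bookkeeping cost,
    used only to illustrate the trade-off.
    """
    encoded = [s.encode("utf-8") for s in strings]
    return sum(len(b) + overhead_per_string for b in encoded)


# A large array of very short strings, as described in the issue.
codes = ["A", "GT", "C", "TT"] * 250_000
print("fixed-length:   ", fixed_length_bytes(codes), "bytes")
print("variable-length:", variable_length_bytes(codes), "bytes")
```

Under that assumption the payload is dwarfed by the per-string bookkeeping, which is the case where padding to a fixed width wins.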

simonster · Aug 09 '14 18:08

Sounds fine to me. Since IO is presumably the bottleneck, we could afford to do a certain amount of analysis: for strings whose type is not a leaf type, determine whether a tighter type is possible, optimize packing, etc.
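One possible shape for that analysis pass, sketched in plain Python rather than Julia and not tied to any HDF5.jl internals: scan the array once, check that every element really is a string, and pick fixed-length storage only when it is estimated to be smaller. `choose_string_layout` and the 16-byte overhead are hypothetical, chosen purely for illustration.

```python
def choose_string_layout(values, overhead_per_string=16):
    """Return ("fixed", width) or ("variable", None) for a list of values.

    overhead_per_string is an ASSUMED per-element cost of variable-length
    storage; tune it to whatever the real format turns out to need.
    """
    # A loosely typed container may hold non-strings; fall back if so.
    if not values or not all(isinstance(v, str) for v in values):
        return ("variable", None)

    lengths = [len(v.encode("utf-8")) for v in values]
    width = max(lengths)

    fixed_cost = width * len(values)  # every element padded to width
    variable_cost = sum(lengths) + overhead_per_string * len(values)

    # Only choose a fixed width if it is non-zero and no larger on disk.
    if width > 0 and fixed_cost <= variable_cost:
        return ("fixed", width)
    return ("variable", None)


print(choose_string_layout(["a", "bb", "c"]))
print(choose_string_layout(["a", "x" * 1000]))
```

The single pass over the data is cheap next to the write itself, and one long outlier is enough to tip the choice back to variable-length storage.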

timholy · Aug 09 '14 19:08