InMemoryDatasets.jl
Enhance `Characters` type
This issue tracks the problem with strings in Julia.

`String` in Julia is not suitable for InMemoryDatasets, and `Characters` (UInt8 - UInt16) is currently good only for strings of up to 15 characters (a compile-time issue with NTuples). `InlineStrings` are restricted and wasteful.

The idea is to have something similar to `Characters`, but backed by a vector of UInt8 instead of an NTuple, with a length attribute that fixes the length of each element. A vector of `Characters{8}` with 10 elements would then be a vector of 10*8 UInt8 plus an attribute of 8, indicating that each string in the vector has length 8. Shorter strings should be padded with spaces. Longer strings should be truncated to 8 characters, or fewer if they contain multi-byte UTF-8 characters.
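To make the proposed layout concrete, here is a minimal sketch of such a container. The names `FixedWidthStrings` and `encode_fixed` are hypothetical and not part of InMemoryDatasets.jl. The sketch also assumes that the width is counted in bytes and that truncation stops at a character boundary, which is one possible reading of the UTF remark above.

```julia
# Hypothetical sketch: a flat byte buffer plus one width attribute.
# Nothing here is an existing InMemoryDatasets.jl API.
struct FixedWidthStrings <: AbstractVector{String}
    data::Vector{UInt8}  # length(data) == width * number of elements
    width::Int           # number of bytes reserved for each element
end

# Empty container whose elements are all `width` bytes long.
FixedWidthStrings(width::Integer) = FixedWidthStrings(UInt8[], width)

Base.size(v::FixedWidthStrings) = (length(v.data) ÷ v.width,)
Base.IndexStyle(::Type{FixedWidthStrings}) = IndexLinear()

# Encode `s` into exactly `width` bytes: truncate at a character
# boundary (assumption), then pad with spaces.
function encode_fixed(s::AbstractString, width::Integer)
    bytes = UInt8[]
    for c in s
        cu = codeunits(string(c))
        # Stop before a character that would overflow the slot,
        # so a multi-byte character is never split.
        length(bytes) + length(cu) > width && break
        append!(bytes, cu)
    end
    append!(bytes, fill(UInt8(' '), width - length(bytes)))
    return bytes
end

function Base.push!(v::FixedWidthStrings, s::AbstractString)
    append!(v.data, encode_fixed(s, v.width))
    return v
end

# Element i occupies bytes (i-1)*width+1 through i*width of the buffer.
function Base.getindex(v::FixedWidthStrings, i::Int)
    @boundscheck checkbounds(v, i)
    start = (i - 1) * v.width + 1
    return String(v.data[start:start + v.width - 1])
end

v = FixedWidthStrings(8)
push!(v, "abc")           # stored padded to 8 bytes
push!(v, "abcdefghijkl")  # stored truncated to 8 bytes
```

Storing the width as a field rather than a type parameter means a single compiled method set serves every width, which may sidestep the compile-time cost seen with NTuples. Whether that trade-off is right for `Characters` is left open here.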