
Adjacent columns with identical types are stored as a matrix internally

Open MarcusKlik opened this issue 8 years ago • 1 comment

That would significantly reduce the overhead when these columns are selected, especially when they are selected in the order in which they were stored.

MarcusKlik • Jul 14 '17 05:07

It's probably more efficient to write separate code for serializing matrices. We could write the whole matrix to a single-column fst file, but random-access reads would then require many seek operations. A better method would be to store the matrix in 'square' blocks of data, e.g. 1024 rows by 1024 columns. Creating the blocks requires reordering the vector data, but that can be done in parallel using OpenMP and will probably be faster than the background IO (so no delay is expected for zero-compression writes).
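
To make the block idea concrete, here is a minimal sketch of the index arithmetic in plain Python. It is not the fst on-disk format: the function names (`block_offset`, `to_blocks`, `read_rows`), the band-by-band block ordering, and the column-major order inside each block are all assumptions chosen for illustration. The reordering loop is written serially; in the real C++ code this is the part that could run in parallel with OpenMP.

```python
def block_offset(r, c, n_rows, n_cols, block_rows, block_cols):
    """Flat position of element (r, c) in the blocked layout (hypothetical layout).

    Blocks are ordered band by band (a band is block_rows consecutive rows),
    left to right within a band. Inside a block, data is column-major.
    """
    bi, bj = r // block_rows, c // block_cols
    # Only the last band can have fewer than block_rows rows.
    rows_in_band = min(block_rows, n_rows - bi * block_rows)
    band_start = bi * block_rows * n_cols
    # Blocks to the left of bj in this band are all full width.
    block_start = band_start + rows_in_band * bj * block_cols
    return block_start + (c - bj * block_cols) * rows_in_band + (r - bi * block_rows)


def to_blocks(columns, block_rows, block_cols):
    """Reorder a list of equal-length columns into the blocked layout."""
    n_cols = len(columns)
    n_rows = len(columns[0]) if columns else 0
    out = [None] * (n_rows * n_cols)
    for c, col in enumerate(columns):  # each column is independent work
        for r, value in enumerate(col):
            out[block_offset(r, c, n_rows, n_cols, block_rows, block_cols)] = value
    return out


def read_rows(blocked, n_rows, n_cols, block_rows, block_cols, col, row_start, row_end):
    """Read rows [row_start, row_end) of one column from the blocked layout."""
    return [
        blocked[block_offset(r, col, n_rows, n_cols, block_rows, block_cols)]
        for r in range(row_start, row_end)
    ]
```

With this layout, a read of a row range in one column touches one contiguous run per band it crosses, and a read of several adjacent columns in stored order touches whole blocks, which is where the reduction in seeks would come from.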

MarcusKlik • Sep 22 '17 07:09