Multi threading phase 3: serialize multiple columns simultaneously

Open fstpackage opened this issue 8 years ago • 0 comments

Some column types can be serialized more efficiently than others: character vector serialization is slow compared to integer, double, logical and factor serialization. We can therefore serialize a character vector in parallel with a column of another type. Pairing a slow column with a fast one maximizes data throughput and keeps the IO device busy.

In short: pair a column that can be (de)serialized faster than IO speed with a column that is (de)serialized slower than IO speed, and process the two in parallel.
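The pairing idea could look roughly like the Python sketch below. It is only an illustration of the scheduling, not fst's implementation or file format: `SLOW_TYPES`, `FIXED_FORMATS`, `serialize_column`, `pair_columns` and `serialize_table` are hypothetical names, and the byte layout is a toy one. In CPython the threads will not give real CPU parallelism for pure-Python work, so a native implementation would be needed for actual gains.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

# Assumption: only character columns are treated as slow to serialize.
SLOW_TYPES = {"character"}
# Assumption: toy fixed-width encodings (64-bit int, 64-bit double).
FIXED_FORMATS = {"integer": "q", "double": "d"}


def serialize_column(name, kind, values):
    """Toy serializer for one column; not the fst on-disk format."""
    if kind == "character":
        # Variable-length data: each string is a 4-byte length plus UTF-8 bytes.
        parts = []
        for value in values:
            raw = value.encode("utf-8")
            parts.append(struct.pack("<I", len(raw)) + raw)
        return b"".join(parts)
    # Fixed-width data: pack all values in one call.
    return struct.pack(f"<{len(values)}{FIXED_FORMATS[kind]}", *values)


def pair_columns(columns):
    """Pair each slow column with a fast one; leftovers become single-column groups."""
    slow = [c for c in columns if c[1] in SLOW_TYPES]
    fast = [c for c in columns if c[1] not in SLOW_TYPES]
    groups = [[s, f] for s, f in zip(slow, fast)]
    paired = len(groups)
    groups += [[c] for c in slow[paired:] + fast[paired:]]
    return groups


def serialize_table(columns):
    """Process groups in order; the columns inside a group run concurrently."""
    result = {}
    with ThreadPoolExecutor(max_workers=2) as pool:
        for group in pair_columns(columns):
            futures = {c[0]: pool.submit(serialize_column, *c) for c in group}
            for name, future in futures.items():
                result[name] = future.result()
    return result
```

For example, `serialize_table([("name", "character", ["a", "bc"]), ("id", "integer", [1, 2])])` serializes `name` and `id` as one concurrent pair. Pairing by type is the simplest heuristic; measured (de)serialization speed relative to IO speed would be a more faithful criterion.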

fstpackage, May 29 '17 21:05