fst
Lightning Fast Serialization of Data Frames for R
Timing measurements with microsecond accuracy are needed to analyse the performance of OpenMP parallel constructs in the core code of `fst`. We need to determine the speedup due to parallel...
Some column-types can be serialized more efficiently than others. Character vector serialization is slow compared to integer, double, logical and factor serialization. We can process serialization of a character vector...
When reading a `fst` file using multiple cores, the slowest operation is the creation of character vectors (and to some extent also factors). That's because R uses a global string...
Hi Marcus, I've just learned about your package, and its performance on the benchmarks looks absolutely impressive! However, could you please clarify some details about the test environment: * What...
Processing character columns is by far the slowest of all data types. For character columns (that are not completely random) we can solve this problem by first converting the vector...
Consider the following example: I create a data.table of 1.22GB and save it to disk. Using `fst::read_fst()` does not allocate more memory than the data itself. However, subsetting `fst` object...
Is there any chance of "list" support? library(data.table); library(fst) nr_of_rows
cbind table
Hi, I have datasets of ~1TB per project, which is too large for 32GB of RAM. I was wondering if there's a way to cbind fst table in...
I find fst_table a very useful class: you do not have to read the file physically, but can still get enough information to know how to process it. Perhaps there could be more...
A _fst\_table_ object is expected to mimic the behavior of a _data.frame_. Therefore, `df[0, ]` syntax should return a zero-row table without an error or warning: ``` r tmp_file 1...
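
For reference, the base R behaviour that the request above asks a _fst\_table_ to mimic can be sketched as follows. This uses a plain `data.frame` only (no `fst` calls); the column names `x` and `y` are made up for illustration and do not come from the original issue.

```r
# Base R reference behaviour: a zero-row subset of a data.frame
# keeps the columns and their types, without error or warning.
df <- data.frame(
  x = 1:3,
  y = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

empty <- df[0, ]

nrow(empty)           # number of rows in the zero-row subset
names(empty)          # column names are preserved
sapply(empty, class)  # column types are preserved
```

A zero-row result of this kind is handy for inspecting column names and types without materialising any data.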