fstlib
A C++ library for lightning fast multi-threaded serialization of tabular data. Home to the `fst` file format.
I was looking into lib/factor/factor_v7.cpp and see code like `if (*nrOfLevels < 128)`. In the comment it says > // use 1 byte per int (Na encoding takes 1 bit)...
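To make the quoted check concrete, here is a minimal sketch of how a storage width could be chosen from the number of factor levels when one bit per element is reserved for an NA flag. Only the `< 128` threshold and the "1 byte, 1 bit for NA" remark come from the excerpt above. The 2-byte and 4-byte thresholds, the bit layout, and every function name are assumptions made for illustration. This is not fstlib's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of fstlib): smallest width in bytes for a
// factor index when one bit of every stored element is reserved for NA.
constexpr std::size_t bytesPerLevelIndex(std::uint32_t nrOfLevels) {
    if (nrOfLevels < 128u) return 1;    // 7 value bits left (threshold from the excerpt)
    if (nrOfLevels < 32768u) return 2;  // 15 value bits left (assumed threshold)
    return 4;                           // assumed fallback width
}

// Assumed 1-byte layout: the high bit flags NA, the low 7 bits hold the index.
constexpr std::uint8_t kNaFlag = 0x80;

inline std::uint8_t encodeLevel(std::uint8_t index, bool na) {
    assert(index < 128);  // the index must fit in the 7 value bits
    return na ? kNaFlag : index;
}

inline bool isNa(std::uint8_t encoded) {
    return (encoded & kNaFlag) != 0;
}

inline std::uint8_t levelIndex(std::uint8_t encoded) {
    return static_cast<std::uint8_t>(encoded & 0x7F);
}
```

Under these assumptions, a factor with 127 levels fits in one byte per element, while 128 levels already needs two, because the eighth bit is spent on the NA flag.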
I know it's going to be a bit of work, but a full description of the fst format would help build connectors into it. From Julia, Python, and any other programming...
To enable client applications to effectively implement the `fstlib` library
To include only the lib subdirectory. Also, add a `coveralls` banner to the homepage.
Currently, there is no clear documentation on how to set up a `C++` project using the `fstlib` library. A sample `C++` project would be a good starting point for potential users....
Explaining the goals behind the `fstlib` library and the differences with the `arrow`/`parquet` philosophy.
Is Linux supported out of the box? If yes, what is the recommended way to compile on Linux?
See for example [here](https://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads). To lower memory requirements, `fstlib` allocates larger blocks of memory that are written to by several threads. In such cases, cache line pollution must be avoided....
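One common way to avoid false sharing in a shared block is to give each thread a region that starts on a cache-line boundary and spans a whole number of cache lines, so that no two threads ever write to the same line. The sketch below illustrates that idea only. The 64-byte line size, the `SharedBlock` type, and all function names are assumptions for illustration, and this is not how `fstlib` itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

constexpr std::size_t kCacheLine = 64;  // assumed line size, typical on x86-64

// Round a byte count up to the next multiple of the cache-line size.
constexpr std::size_t roundUpToCacheLine(std::size_t n) {
    return (n + kCacheLine - 1) / kCacheLine * kCacheLine;
}

// Hypothetical container: one allocation, split into per-thread regions.
struct SharedBlock {
    std::vector<std::uint8_t> storage;
    std::size_t offset = 0;  // first cache-line-aligned byte inside storage
    std::size_t stride = 0;  // bytes reserved per thread (multiple of kCacheLine)

    std::uint8_t* region(unsigned t) {
        return storage.data() + offset + t * stride;
    }
};

// Every thread fills only its own region, so no cache line has two writers.
SharedBlock writeInParallel(unsigned nThreads, std::size_t bytesPerThread) {
    assert(nThreads > 0);
    SharedBlock b;
    b.stride = roundUpToCacheLine(bytesPerThread);
    // Over-allocate by one line so the first region can be aligned.
    b.storage.assign(b.stride * nThreads + kCacheLine, 0);
    const auto addr = reinterpret_cast<std::uintptr_t>(b.storage.data());
    b.offset = (kCacheLine - addr % kCacheLine) % kCacheLine;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nThreads; ++t) {
        workers.emplace_back([&b, t, bytesPerThread] {
            std::uint8_t* p = b.region(t);
            for (std::size_t i = 0; i < bytesPerThread; ++i) {
                p[i] = static_cast<std::uint8_t>(t + 1);
            }
        });
    }
    for (auto& w : workers) w.join();
    return b;
}
```

The trade-off is a little wasted memory per thread (at most one cache line minus one byte) in exchange for writes that never contend for the same line.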
First, thanks for the library! What is the recommended approach to write large datasets (e.g. 20+ GB csv files)? Is there any way to stream reading/writing? I...