Dipterix Wang
For those interested in how I compared these two ways of loading data, here are the profiling results. ### Subset data when reading via `fst` ```r #### Subset using read_fst(...,...
In my test, I load **one column** and subset it each time. Consider a ~200 GB file with 80 columns (2.5 GB per column) when your RAM is only 4 GB: you...
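A minimal sketch of the one-column, row-range read described above, assuming the `fst` package is installed. The file, the column names `id` and `value`, and the row range are made up for illustration; the whole example is skipped when `fst` is not available.

```r
# Sketch only: toy data stands in for the large file discussed above.
if (requireNamespace("fst", quietly = TRUE)) {
  path <- tempfile(fileext = ".fst")

  # Hypothetical two-column table written to disk once.
  df <- data.frame(id = 1:1000, value = rnorm(1000))
  fst::write_fst(df, path)

  # Read a single column and only rows 101..200, so the rest of the
  # file never has to be loaded into memory.
  part <- fst::read_fst(path, columns = "value", from = 101, to = 200)

  print(dim(part))
  unlink(path)
}
```

Passing `columns`, `from`, and `to` lets `read_fst()` do the subsetting while reading, rather than loading the full table first and subsetting in R afterwards.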
@MarcusKlik thank you for developing such a great package, and I appreciate the recently published `fstcore` package. I solved my issue by implementing C++-level control with `fstcore` ([`lazyarray`](https://github.com/dipterix/lazyarray)). This issue...
Uh... I thought fst once supported big-endian systems, or has Solaris always been unsupported? Also, do you have any plans to enable big-endian support, given that neither LZ4 nor ZSTD...
I see. Is it possible to store predefined bytes within the meta block that indicate the endianness, for example one or more `00001111` (or something that has a very low probability...
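To make the marker idea concrete, here is a rough base-R sketch. It is not how fst stores its metadata: the marker value `0x01020304` and the helper name `detect_endian` are assumptions chosen for illustration. A multi-byte marker is used because byte order only becomes visible across several bytes.

```r
# Hypothetical 4-byte marker (an assumption, not part of the fst format).
# Its bytes differ in each position, so the two byte orders are distinguishable.
ENDIAN_MARKER <- 0x01020304L

# Given the first bytes of a header, report which byte order reproduces
# the marker, or NA if neither does.
detect_endian <- function(bytes) {
  stopifnot(is.raw(bytes), length(bytes) >= 4L)
  for (endian in c("little", "big")) {
    value <- readBin(bytes[1:4], what = "integer", n = 1L,
                     size = 4L, endian = endian)
    if (identical(value, ENDIAN_MARKER)) return(endian)
  }
  NA_character_
}

# Simulate headers written on a little-endian and a big-endian machine.
little_header <- writeBin(ENDIAN_MARKER, raw(), size = 4L, endian = "little")
big_header    <- writeBin(ENDIAN_MARKER, raw(), size = 4L, endian = "big")

print(detect_endian(little_header))
print(detect_endian(big_header))
```

A reader could compare the detected order with `.Platform$endian` and byte-swap the payload only when the two differ.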
## `purge` before benchmark

Hi @MarcusKlik. I have been profiling my package recently and found something very interesting (not sure whether this happens only on my system,...
To correct the comment above: I'm not sure whether >0.6~0.9 GB/s can only be reached with multiple threads. I was using multisession parallelism in R to run a for...
Found in the [R documentation](https://cran.r-project.org/doc/manuals/r-release/R-exts.html#OpenMP-support): > note that some toolchains used for R (including Apple’s for macOS and some others using clang) have no OpenMP support at all, not even omp.h....
No, data.table also failed. There are two problems: 1. Apple's default toolchain does not support `openmp`. This can be fixed by detecting whether brew has `llvm` and `libomp` installed and...
Hi @MarcusKlik, you might want to give Mac users some instructions in your README on how to set up these environment variables. An alternative is to compile the code...