How to extract contents from a fst file when R crashes reading it
Hey everyone,
First of all, thank you for this package, which is quite helpful in our work. After writing and reading a lot of files already, I am now experiencing a problem for the first time.
When I try to read a 12 GB fst file (using read_fst(path_fstfile)), R crashes. The error message is: "R Session Aborted. R encountered a fatal error. The session was terminated."
This can be reproduced on different computers and from different sources (network, local drive). It is independent of whether data.table is loaded, and of whether the script is run through RStudio or from the command line using Rscript.exe. There is sufficient memory available (more than 100 GB RAM). Other fst files can be read successfully.
metadata_fst() works well on this file (see output below).
Is there any method to retrieve the contents of this file?
Thank you in advance for your help. Gabriel
> metadata_fst(path_fstfile)
<fst file>
120534568 rows, 43 columns (demandsimulationResult.fst)
* 'tripId' : integer
* 'legId' : integer
* 'personnumber' : integer
* 'householdOid' : integer
* 'personOid' : integer
* ....
Note: other columns are of type character, double and logical.
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)
Matrix products: default
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] fst_0.9.4
loaded via a namespace (and not attached):
[1] compiler_4.1.0 parallel_4.1.0 tools_4.1.0 Rcpp_1.0.7
Note: On another computer with R version 4.1.2 the error occurs as well.
Have you tried incrementally reading parts of the file? E.g.
read_fst(path_fstfile, from=1, to=100)
read_fst(path_fstfile, from=100, to=1000)
read_fst(path_fstfile, from=120534468)
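To narrow down where the crash happens, the partial reads could be automated. The following is only a rough sketch, not a tested fix: chunk_ranges() and salvage_in_chunks() are hypothetical helper names, and the chunk size, output directory and log file name are arbitrary assumptions. A hard session abort cannot be caught with tryCatch(), so each range is written to a log file before it is read; after a crash, the last logged line shows the range that was being read. Chunks that read successfully are written to separate fst files so their contents are preserved.

```r
# Split 1..n_rows into consecutive, non-overlapping (from, to) ranges.
chunk_ranges <- function(n_rows, chunk_size) {
  from <- seq(1, n_rows, by = chunk_size)
  to <- pmin(from + chunk_size - 1, n_rows)
  data.frame(from = from, to = to)
}

# Read the file range by range and write every readable chunk to out_dir.
# `reader` and `writer` default to the fst functions but can be replaced.
salvage_in_chunks <- function(path, n_rows, out_dir, chunk_size = 1e6,
                              reader = fst::read_fst,
                              writer = fst::write_fst) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  log_file <- file.path(out_dir, "progress.log")
  ranges <- chunk_ranges(n_rows, chunk_size)
  ranges$ok <- FALSE

  for (i in seq_len(nrow(ranges))) {
    # Log BEFORE reading: a fatal crash leaves this as the last line.
    cat(sprintf("rows %.0f-%.0f\n", ranges$from[i], ranges$to[i]),
        file = log_file, append = TRUE)

    # Ordinary R errors are caught here; a session abort is not.
    chunk <- tryCatch(
      reader(path, from = ranges$from[i], to = ranges$to[i]),
      error = function(e) NULL
    )

    if (!is.null(chunk)) {
      writer(chunk, file.path(out_dir, sprintf("chunk_%05d.fst", i)))
      ranges$ok[i] <- TRUE
    }
  }
  ranges
}

# Usage with the row count reported by metadata_fst() above:
# salvage_in_chunks(path_fstfile, n_rows = 120534568, out_dir = "salvaged")
```

If R crashes on a particular range, the same range could then be read one column at a time via the columns argument of read_fst() to see whether a single column is responsible.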
Hi @gabowi, did you check your memory consumption while the fst file is loading from disk? This sounds like your system doesn't have enough memory to read this file, but that shouldn't crash R. Were the partial reads suggested by @fox34 successful?
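As a rough check on the memory question, the size of the loaded table can be estimated from the row count reported by metadata_fst(). This is a back-of-the-envelope sketch under stated assumptions: estimate_gib() is a hypothetical helper, it uses R's usual element sizes on a 64-bit build (4 bytes for integer and logical, 8 bytes for double), and for character columns it counts only the 8-byte pointer per element, so the string contents themselves and any temporary buffers used while reading are not included.

```r
# Assumed per-element sizes in bytes on 64-bit R; character counts only
# the pointer to each string, not the string data.
bytes_per_element <- c(integer = 4, logical = 4, double = 8, character = 8)

# Estimated in-memory size in GiB for columns of the given types.
estimate_gib <- function(n_rows, types) {
  unknown <- setdiff(types, names(bytes_per_element))
  if (length(unknown) > 0) {
    stop("unknown column type: ", paste(unknown, collapse = ", "))
  }
  sum(n_rows * bytes_per_element[types]) / 2^30
}

n_rows <- 120534568

# One integer column, e.g. 'tripId'
estimate_gib(n_rows, "integer")

# Bound for all 43 columns if every element took 8 bytes
estimate_gib(n_rows, rep("double", 43))
```

With 8 bytes per element for all 43 columns this comes to about 38.6 GiB before string contents, so the actual peak usage would still need to be observed while the file is loading.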