microdadosBrasil

Add support for out-of-memory datasets

Open lucasmation opened this issue 8 years ago • 5 comments

Maybe using ffdf.

We should check whether a newer, simpler approach exists.

lucasmation avatar Jun 15 '16 17:06 lucasmation

I think the best approach would be MonetDBLite: it is fast and works out of memory. Check this talk from the useR! 2016 conference.

lucasmation avatar Jul 07 '16 02:07 lucasmation

An added benefit is that it can support complex surveys.

lucasmation avatar Jul 07 '16 02:07 lucasmation

Oops, I forgot the video link for the talk: https://channel9.msdn.com/Events/useR-international-R-User-conference/useR2016/Efficient-tabular-data-ingestion-and-manipulation-with-MonetDBLite

I also asked for advice on this on r-package-devel: https://stat.ethz.ch/pipermail/r-package-devel/2016q3/000912.html

lucasmation avatar Jul 11 '16 14:07 lucasmation

In my experience, fread worked better than ffdf with the alunos dataset from the Censo Ensino Superior.

rsljr avatar Jul 14 '18 21:07 rsljr

In my fork, I customized read_data to import the Censo Escolar files larger than 1 GB with ffdf. It works fine.

lnribeiro avatar Feb 04 '19 22:02 lnribeiro