microdadosBrasil
Add support for out-of-memory datasets
I think the best approach would be MonetDBLite, which is fast and works out of memory. Check this talk at the useR! 2016 conference.
an added benefit is that it can support complex surveys
Oops, I forgot the video link for the talk: https://channel9.msdn.com/Events/useR-international-R-User-conference/useR2016/Efficient-tabular-data-ingestion-and-manipulation-with-MonetDBLite
I also asked for advice on this on r-package-devel: https://stat.ethz.ch/pipermail/r-package-devel/2016q3/000912.html
In my experience, fread worked better than ffdf with the alunos dataset from the Censo Ensino Superior.
In my fork, I have customized read_data to import the > 1 GB Censo Escolar datasets with ffdf. It works fine.
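For anyone comparing the two readers mentioned above, here is a minimal sketch of the difference, not the actual read_data code from the fork. It assumes the data.table and ff packages are installed; the temporary file, the column names (`id`, `idade`), and the batch sizes are made up for illustration, and a real Censo Escolar file would be far larger.

```r
library(data.table)
library(ff)

# Small stand-in file (hypothetical columns); the real files are > 1 GB.
path <- tempfile(fileext = ".csv")
fwrite(
  data.table(id = 1:1000, idade = rep(18:57, length.out = 1000)),
  path,
  sep = ";"
)

# fread: loads the whole file into RAM as a data.table.
dt <- fread(path, sep = ";")

# ff: reads the file in batches and keeps the columns on disk (ffdf).
# first.rows / next.rows control how many rows are parsed per batch.
fd <- read.table.ffdf(
  file = path,
  header = TRUE,
  sep = ";",
  colClasses = c("integer", "integer"),
  first.rows = 100,
  next.rows = 500
)

# Both objects hold the same rows; fd$idade[] pulls one column into RAM.
n_dt <- nrow(dt)
n_fd <- nrow(fd)
same_sum <- sum(dt$idade) == sum(fd$idade[])
```

fread is usually the simpler choice while the file still fits in memory; the ffdf route trades speed for the ability to work with files larger than RAM.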