bigrf
Memory Leak when calling predict inside of RStudio
I've had a persistent bug when using predict on a bigrf model inside of RStudio. Essentially, there seems to be a memory leak that leads to RStudio consuming all of my machine's RAM and forcing me to shut down my computer. Curiously, this does not happen when I run my script from the command line using Rscript.
Are you running predict in parallel? Can you share a code snippet?
No, I'm not running it in parallel. It's a pretty straightforward implementation. While it's hard to share the exact code snippet, as it's been abstracted out into separate functions, it's basically this:
require(bigrf)

samp <- sample(1:nrow(iris), nrow(iris) * 0.6)
train <- iris[samp, ]
test <- iris[-samp, ]

m <- bigrfc(train,
            train$Species,
            ntree = 10,
            varselect = 1:4,
            trace = 1)
p <- predict(m, test)
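One way to narrow this down might be to measure memory growth around the predict call itself. A minimal sketch using only base R's gc() (assuming the `m` and `test` objects from the snippet above):

```r
library(bigrf)

gc(reset = TRUE)      # reset the "max used" counters before the call
p <- predict(m, test)
print(gc())           # the "max used" columns show peak allocation during predict()
```

Note that gc() only reports R's own heap; if the leak is in compiled code (or in RStudio's rsession process itself), the R-side numbers may stay flat while the process RSS grows, so it's worth comparing against the OS view (e.g. ps or top) under both RStudio and Rscript.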
The test set is ~2 GB and I'm running it on a machine with 16 GB of RAM.
You can see the repository here: https://github.com/enigma-io/smoke-alarm-risk. The functions in question are here: https://github.com/enigma-io/smoke-alarm-risk/blob/master/rscripts/model.R
How many cores does your machine have? How long does it take to train on the 2 GB dataset?
I have the same issue, with basically the same code as @abelsonlive. The dataset is 300 MB on a 6 GB, 4-core machine, with 50 trees. It occurs with and without parallel execution.
Using R 3.2.3 on Fedora 23.