bigrf

Memory Leak when calling predict inside of RStudio

Open abelsonlive opened this issue 8 years ago • 6 comments

I've had a persistent bug when using predict on a bigrf model inside of RStudio. There seems to be a memory leak: RStudio consumes all of my machine's RAM and forces me to shut down my computer. Curiously, this does not happen when I run my script from the command line using Rscript.

abelsonlive avatar Sep 17 '15 18:09 abelsonlive

[Screenshot taken 2015-09-17 14:26:02]

abelsonlive avatar Sep 17 '15 18:09 abelsonlive

Are you running predict in parallel? Can you share a code snippet?

aloysius-lim avatar Sep 17 '15 23:09 aloysius-lim

No, I'm not running it in parallel. It's a pretty straightforward implementation. It's hard to share the exact code because it's been abstracted out into separate functions, but it's basically this:

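library(bigrf)

# 60/40 train/test split of iris (stand-in for the real data)
samp <- sample(seq_len(nrow(iris)), floor(nrow(iris) * 0.6))
train <- iris[samp, ]
test <- iris[-samp, ]

# Fit a 10-tree classification forest on the four predictor columns
m <- bigrfc(train,
            train$Species,
            ntree = 10L,
            varselect = 1:4,
            trace = 1L)

# The memory growth is observed during this call
p <- predict(m, test)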

In my actual run, the test set is ~2 GB and I'm running it on a machine with 16 GB of RAM.

abelsonlive avatar Sep 24 '15 05:09 abelsonlive

You can see the repository here: https://github.com/enigma-io/smoke-alarm-risk. The functions in question are here: https://github.com/enigma-io/smoke-alarm-risk/blob/master/rscripts/model.R

abelsonlive avatar Sep 24 '15 05:09 abelsonlive

How many processor cores does your computer have? How long does it take to train on the 2 GB data?

abiyug avatar Dec 03 '15 16:12 abiyug

I have the same issue, with basically the same code as @abelsonlive.

The dataset is 300 MB, on a 4-core machine with 6 GB of RAM, using 50 trees. It occurs with and without parallel execution.

Using R 3.2.3 on Fedora 23.

ajnisbet avatar Apr 17 '16 09:04 ajnisbet
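
For anyone trying to narrow this down, the following is a minimal diagnostic sketch, not a fix. It reports the core count, whether the session appears to be RStudio, and the elapsed time and change in R-managed memory around a single call, so the same script can be compared under RStudio and Rscript. `measure_call` is a hypothetical helper, not part of bigrf, and the RStudio check assumes the IDE sets the `RSTUDIO` environment variable to `"1"`. Note that `gc()` only sees memory on R's own heap; bigrf builds on bigmemory, which can allocate outside that heap, so an OS-level process monitor is still needed to confirm total usage.

```r
# Hypothetical helper (not part of bigrf): run a function and report
# elapsed time plus the change in R-managed memory as seen by gc().
measure_call <- function(fun) {
  before <- gc(reset = TRUE)   # collect garbage and reset the "max used" stats
  elapsed <- system.time(result <- fun())[["elapsed"]]
  after <- gc()
  # Column 2 of gc()'s matrix is the "used" size in Mb (rows: Ncells, Vcells).
  list(
    result         = result,
    elapsed_sec    = elapsed,
    used_mb_before = sum(before[, 2]),
    used_mb_after  = sum(after[, 2])
  )
}

# Environment details that were asked about in the thread.
cat("Logical cores:", parallel::detectCores(), "\n")
# Assumption: RStudio sets RSTUDIO=1 in the R session's environment.
cat("Running in RStudio:", identical(Sys.getenv("RSTUDIO"), "1"), "\n")

# Stand-in workload so the sketch runs without bigrf installed.
# With a fitted model, one would pass: function() predict(m, test)
stats <- measure_call(function() sum(as.numeric(seq_len(1e6))))

cat(sprintf("Elapsed: %.2f s, R heap used: %.1f Mb -> %.1f Mb\n",
            stats$elapsed_sec, stats$used_mb_before, stats$used_mb_after))
```

If the R heap figures stay flat while the process's resident memory keeps climbing, that would point toward allocations outside R's heap or toward the front end rather than toward R objects created by predict.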