iml
Object storage sizes
I have been working with an interaction forest (intaus) fitted on 2000 observations, consisting of the default 20000 trees (from package diversityForest). This forest uses 306 682 KB of disk space. I have applied Interaction$new and FeatureEffects$new to that forest and stored the resulting objects on disk (R workspaces with a single object each). I end up with the following stored object sizes:
hilf <- Predictor$new(intaus, data=as.data.frame(yx2[,-1]), y=yx2[,1], predict.function=predfun)
hilf2 <- Interaction$new(hilf)
## storage size is 1 227 232 KB
fes <- FeatureEffects$new(hilf)
## storage size is 1 248 226 KB
To me, these sizes appear excessive. I wonder which functionality of these objects I might be missing that justifies such huge object sizes. Or would it perhaps be possible for Interaction$new and FeatureEffects$new to return smaller objects without sacrificing functionality?
Best, Ulrike
What is your use case for storing these objects?
One reason for the size is that the Predictor is part of Interaction / FeatureEffects. But that does not seem to fully explain the size; maybe it is stored more than once.
The use case is that I don't want to invest the run time again and want to have the objects available later, e.g. for plotting or printing in comparison to other numbers calculated elsewhere.
I have not tried it yet, but you could try setting the predictor to NULL:
interaction_object$predictor = NULL
This should make the object a lot smaller. The results are stored in a data.frame in $results, and plotting should not be affected either. It's a hacky solution, so I can't guarantee it works right away.
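To illustrate the idea without the original forest, here is a minimal sketch in base R. The environment `obj` is only a hypothetical stand-in for an iml result object (iml classes are R6 objects, which are environments under the hood); the field contents and values are made up for illustration. It compares the file size before and after dropping the large component, and also shows the more conservative option of saving only the results table.

```r
# Hypothetical stand-in for an iml result object: an environment holding
# a large "predictor" and a small "results" data.frame (values made up).
obj <- new.env(parent = emptyenv())
obj$predictor <- list(model = rnorm(1e6))          # large payload
obj$results <- data.frame(feature = c("x1", "x2"),
                          interaction = c(0.12, 0.34))

full_file <- tempfile(fileext = ".rds")
slim_file <- tempfile(fileext = ".rds")
res_file  <- tempfile(fileext = ".rds")

saveRDS(obj, full_file)          # object including the large payload
obj$predictor <- NULL            # drop the large component
saveRDS(obj, slim_file)          # same object without the payload
saveRDS(obj$results, res_file)   # alternative: keep only the results table

# File sizes in bytes; the slim file should be far smaller than the full one
sizes <- file.size(c(full_file, slim_file, res_file))
print(sizes)
```

Saving only `$results` avoids modifying the object at all, at the cost of losing its plot and print methods.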
Thank you for the proposal. After setting $predictor to NULL, the file size was only 252 kB. The plot method still works; the print method doesn't (but I can of course still access $results).
I think it would be highly desirable for the output objects for interactions and feature effects to be more parsimonious by default (green ML!).
By the way, I found it quite difficult to assess object sizes from within R. object.size(hilf2) returned a size of 448 bytes (!) for the huge object. That size remains unchanged after removing $predictor.
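A likely explanation is that iml objects are R6 objects, i.e. environments, and `object.size()` does not count what an environment contains. The sketch below uses a plain environment as a hypothetical stand-in and measures the length of the serialized object instead, which is roughly what ends up on disk before compression. This is one possible workaround, not an official recommendation; if the lobstr package is installed, `lobstr::obj_size()` is another option that follows environments.

```r
# A plain environment with a large payload, standing in for an R6 object
e <- new.env(parent = emptyenv())
e$payload <- numeric(1e6)        # one million doubles, about 8 MB

# object.size() reports only the environment itself, not its contents
size_reported <- as.numeric(object.size(e))

# Length of the serialized object in bytes, which includes the payload
size_serialized <- length(serialize(e, NULL))

print(size_reported)
print(size_serialized)
```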