pdp Add a sample feature for ICE/c-ICE/d-ICE curves

Open bgreenwell opened this issue 6 years ago • 4 comments

For example, to plot a random (sub)sample of curves:

partial(fit, pred.var = "x3", ice = TRUE, frac = 0.5, plot = TRUE)

This would be easiest to accomplish before converting to long format; for example:

if (frac < 1) {
  pd.df <- pd.df[sample(nrow(pd.df), size = floor(frac*nrow(pd.df)), replace = FALSE), ]
}
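As a self-contained illustration of the same idea, here is a minimal sketch in base R. It assumes the ICE curves are held in wide format with one row per observation and one column per grid value; the helper name sample_ice_curves and the simulated matrix are hypothetical and not part of pdp.

```r
# Hypothetical helper (not part of pdp): keep a random fraction of ICE curves
# stored in wide format (one row per observation, one column per grid value).
sample_ice_curves <- function(ice_wide, frac = 1) {
  stopifnot(frac > 0, frac <= 1)
  if (frac < 1) {
    # Always keep at least one curve so the result is never empty
    n_keep <- max(1L, floor(frac * nrow(ice_wide)))
    keep <- sample(nrow(ice_wide), size = n_keep, replace = FALSE)
    # drop = FALSE preserves the matrix/data frame shape for a single row
    ice_wide <- ice_wide[keep, , drop = FALSE]
  }
  ice_wide
}

set.seed(101)
ice_wide <- matrix(rnorm(10 * 5), nrow = 10)  # simulated: 10 curves, 5 grid points
sub <- sample_ice_curves(ice_wide, frac = 0.5)
dim(sub)  # 5 rows (curves) by 5 columns (grid points)
```

Sampling rows at this stage only thins what gets plotted; every curve has already been computed, so it does not reduce computation time.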

May 30 '18 18:05 bgreenwell

This is exactly what I came here to ask about! I assume this feature isn't yet implemented? Until the feature is implemented, what is the "right" way to go about hacking this together?

I don't want to just restrict the sample of curves plotted. Is there a way to restrict the sample of curves computed (in addition to plotted), so as to reduce computation time? My dataset has 2.5 million observations, so even with parallel = TRUE, it's taking hours to compute and plot a single feature.

I was thinking of just feeding my random forest model a random subset of the data, and inputting that into the partial command, but I'm worried this is not correct.

Jul 03 '19 19:07 DeFilippis

Hey @DeFilippis, the easiest way to accomplish this right now is to provide a sampled version of the original training data via the ‘train’ argument in partial. Fit your model on the full training set though! I can provide a simple example later on if you need!
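A minimal sketch of that approach, assuming the pdp package is installed. The linear model and the built-in mtcars data are illustrative stand-ins for the real model and dataset, and the subsample size of 16 rows is arbitrary.

```r
library(pdp)

set.seed(101)

# Fit the model on the FULL training set
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Draw a random subsample of the training rows (here half of the 32 rows)
train_sub <- mtcars[sample(nrow(mtcars), size = 16, replace = FALSE), ]

# ICE curves are computed only for the rows passed via `train`
ice_sub <- partial(fit, pred.var = "wt", ice = TRUE, train = train_sub)
head(ice_sub)
```

Because the curves are computed only for the rows in train_sub, this reduces computation as well as the number of curves drawn, while the model itself still reflects the full data.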

Now that vip has been updated on CRAN, I’ve started to work on pdp, so hopefully these features will be available in the next release!

Jul 03 '19 19:07 bgreenwell

Perfect -- that's really easy. I'm using this in case it helps anybody:

partial(model, pred.var = "predictor", ice = TRUE, center = TRUE, plot = TRUE,
        plot.engine = "ggplot2", parallel = TRUE, paropts = list(.packages = "ranger"),
        train = sample_frac(data, .5))

sample_frac is from dplyr (part of the tidyverse).

Jul 03 '19 23:07 DeFilippis

That should do it! I’ll be sure to include this feature in the next release, so hopefully soon! Same with the squash function as well!

Jul 04 '19 01:07 bgreenwell
