CAST
Account for tibble (non-)drop behavior in aoa
First, thanks for your work on CAST. It is a very nice package and I am looking forward to further developments.
I recently ran into an issue while following the tutorial https://cran.r-project.org/web/packages/CAST/vignettes/AOA-tutorial.html with my own data. I ran the function aoa, but the AOA$AOA results were all zeros.
AOA <- aoa(newdata = newdata, model = mod1, returnTrainDI = TRUE, cl = cl)
I found the issue was that I am using a tibble when training the model as below:
mod1 <- train(x = mytbl[, predictorNames],
              y = mytbl$response,
              method = "rf",
              importance = TRUE,
              tuneGrid = expand.grid(mtry = c(2:length(predictorNames))),
              trControl = trainControl(method = "cv", savePredictions = TRUE))
Because of that, model$trainingData is also a tibble, and on line 168, newdata[,catvar] becomes NA, because I have one categorical predictor. A tibble has different dropping behavior than a data.frame when a single column is selected. Specifically, unique(train[,catvar]) returns a one-column tibble instead of a vector.
https://github.com/HannaMeyer/CAST/blob/b34bc3526226b9a9bee5111d684d68dcf07d0432/R/aoa.R#L168
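A minimal sketch of the difference in dropping behavior, using toy data. The column names and values below are made up for illustration and are not taken from my dataset or from the CAST code:

```r
library(tibble)  # assumes the tibble package is installed

# Toy data with one categorical predictor (hypothetical names and values)
df  <- data.frame(num = c(1.5, 2.0, 3.5),
                  catvar = c("a", "b", "a"),
                  stringsAsFactors = FALSE)
tbl <- as_tibble(df)

# data.frame: selecting a single column with [, j] drops to a plain vector
u_df <- unique(df[, "catvar"])
print(class(u_df))

# tibble: the same call keeps a one-column tibble, so unique() also returns a tibble
u_tbl <- unique(tbl[, "catvar"])
print(class(u_tbl))

# [[ ]] extracts the column as a vector for both classes
u_vec <- unique(tbl[["catvar"]])
print(u_vec)
```

Any code that expects unique(train[,catvar]) to be a vector will therefore behave differently when train is a tibble.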
The solution for me was to use mytbl <- as.data.frame(mytbl) before training the model, but I would suggest adding the following at the beginning of the aoa function so that it also handles tibbles robustly:
if(is.null(train)){train <- as.data.frame(model$trainingData)}
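To illustrate the effect of that coercion, here is a small sketch. The `model` object is a hypothetical stand-in (a plain list with a trainingData element), not a real caret model, and the code is not the actual body of aoa:

```r
library(tibble)  # assumes the tibble package is installed

# Hypothetical stand-in for a trained model whose training data is a tibble
model <- list(trainingData = tibble(num = c(1.5, 2.0, 3.5),
                                    catvar = c("a", "b", "a")))
train  <- NULL       # mimics the case where no training data is passed explicitly
catvar <- "catvar"   # name of the categorical predictor (assumed)

# Suggested guard: coerce to a plain data.frame before any [, j] subsetting
if (is.null(train)) {
  train <- as.data.frame(model$trainingData)
}

# Single-column selection now drops to a vector, as with any data.frame
train_levels <- unique(train[, catvar])
print(train_levels)
```

With the coercion in place, the result is the same whether the model was trained on a tibble or on a data.frame.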
I don't have a ready reprex but I hope my description is sufficient to understand the issue.