twidlr
Making sure that `predict` always returns a vector or data.frame
`predict` should always return a set of predicted values that match the data. twidlr can support various tidy functionality by always ensuring that the returned result is a vector or data.frame.
A vector would be returned for a single set of predictions, and a data.frame for multiple sets of predictions, like those generated by `naiveBayes` or `glmnet`.
This would support functions similar to modelr's `add_predictions`, but using `twidlr::predict` rather than `stats::predict`, and passing `...`.
This would likely involve a generic function that checks `predict` output and coerces the results to the expected output. E.g., it would check for things like matrices, data.frames with single values, etc.
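A minimal sketch of what such a coercion step might look like, using only base R. The helper name `coerce_predictions()` is hypothetical (it is not part of twidlr or any released package), and the rule that a one-column result is unwrapped to a vector is just one possible answer to the design question, not a settled choice.

```r
# Hypothetical helper (not a twidlr function): normalise predict() output so
# that a single set of predictions is a vector and multiple sets are a
# data.frame.
coerce_predictions <- function(pred) {
  # Matrices (e.g. one column per class or per interval bound) become
  # data.frames, keeping their column names.
  if (is.matrix(pred)) {
    pred <- as.data.frame(pred, stringsAsFactors = FALSE)
  }

  if (is.data.frame(pred)) {
    # A one-column data.frame holds a single set of predictions: unwrap it.
    if (ncol(pred) == 1L) {
      return(pred[[1L]])
    }
    return(pred)
  }

  # Plain vectors and factors are already in the expected shape;
  # one-dimensional arrays are flattened to vectors.
  if (is.atomic(pred) && length(dim(pred)) <= 1L) {
    return(if (is.null(dim(pred))) pred else as.vector(pred))
  }

  stop("Unsupported prediction output of class: ",
       paste(class(pred), collapse = ", "))
}

# Usage with stats::predict on a linear model.
fit <- lm(mpg ~ wt, data = mtcars)

# Single set of predictions: stays a numeric vector.
single <- coerce_predictions(predict(fit, newdata = mtcars))

# predict.lm with interval = "confidence" returns a matrix with columns
# fit, lwr and upr, which is coerced to a data.frame.
multi <- coerce_predictions(
  predict(fit, newdata = mtcars, interval = "confidence")
)
```

Outputs that are neither vector-like nor tabular fall through to `stop()` in this sketch, so they would need explicit handling.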
In the case of a single set of predictions, we must consider whether a vector is preferred to a data.frame of one column. Allowing either a vector or a data.frame could lead to confusion, as the output structure is not always the same. On the other hand, a data.frame of one column is inconvenient in many cases (e.g. in `dplyr::mutate` statements).
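The trade-off can be illustrated with a small base R sketch. It uses `transform()` as a stand-in for `dplyr::mutate` purely to avoid an extra dependency, and the object names (`pred_vec`, `pred_df`) are illustrative assumptions.

```r
fit <- lm(mpg ~ wt, data = mtcars)

# The same predictions in the two candidate shapes.
pred_vec <- unname(predict(fit, newdata = mtcars))  # plain numeric vector
pred_df  <- data.frame(pred = pred_vec)             # one-column data.frame

# A vector can be used directly when adding columns or doing arithmetic.
with_vec <- transform(mtcars, pred = pred_vec, resid = mpg - pred_vec)

# A one-column data.frame has to be unwrapped first, here with [[ ]].
with_df <- transform(mtcars,
                     pred  = pred_df[["pred"]],
                     resid = mpg - pred_df[["pred"]])

# Without unwrapping, arithmetic returns another data.frame, not a vector.
resid_df <- mtcars$mpg - pred_df
```

Both routes give the same augmented data, but the data.frame form needs an extra extraction step at every use.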
After posting the question on Twitter, a few notes came up about cases where this may not apply:
- Spatial models. See this and this.
- Predicted counts in RxCx
- MCMC output
- `predict.gam(type = "lpmatrix")`