Making sure that `predict` always returns a vector or data.frame

Open drsimonj opened this issue 7 years ago • 3 comments

`predict` should always return a set of predicted values that matches the data. twidlr can support a range of tidy workflows by ensuring that the returned result is always a vector or data.frame.

A vector would be returned for a single set of predictions, and a data.frame would be returned for multiple sets of predictions, like those generated by `naiveBayes` or `glmnet`.

This would support functions similar to modelr's `add_predictions`, but using `twidlr::predict` rather than `stats::predict`, and passing `...` through to it.
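A rough sketch of such a helper, for illustration only. The name `add_predictions_sketch` and its `var` argument are hypothetical, and the sketch calls `stats::predict` as a stand-in because the final `twidlr::predict` interface is not settled here. It also assumes `predict` returns a single vector for the given model.

```r
# Hypothetical helper: append predictions to `data` as a new column,
# forwarding any extra arguments in `...` to the predict method.
add_predictions_sketch <- function(data, model, var = "pred", ...) {
  # Assumption: predict() returns one value per row of `data`.
  data[[var]] <- stats::predict(model, newdata = data, ...)
  data
}

# Linear model: one numeric prediction per row.
fit <- lm(mpg ~ wt, data = mtcars)
out <- add_predictions_sketch(mtcars, fit)

# Logistic model: `type = "response"` travels through `...`.
fit2 <- glm(am ~ wt, data = mtcars, family = binomial)
out2 <- add_predictions_sketch(mtcars, fit2, var = "prob", type = "response")
```

If `predict` returned a matrix or multi-column data.frame instead, the assignment above would no longer produce a plain column, which is the problem this issue is about.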

drsimonj, Jun 05 '17 22:06

This would likely involve a generic function that checks the output of `predict` and coerces it to the expected structure. For example, it would check for things like matrices, or data.frames with single values, etc.
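A minimal sketch of what that check could look like, using base R only. `coerce_predictions` is a hypothetical name, not an existing twidlr function, and the rules (one column becomes a vector, several columns become a data.frame) are one possible reading of the proposal above.

```r
# Hypothetical helper: coerce predict() output to a vector or data.frame.
coerce_predictions <- function(x) {
  if (is.matrix(x)) {
    # A one-column matrix is a single set of predictions.
    if (ncol(x) == 1L) return(as.vector(x))
    return(as.data.frame(x))
  }
  if (is.data.frame(x)) {
    # A one-column data.frame is unwrapped to its column.
    if (ncol(x) == 1L) return(x[[1L]])
    return(x)
  }
  # Plain vectors (including factors) pass through unchanged.
  if (is.atomic(x) && is.null(dim(x))) return(x)
  stop("Unsupported prediction type: ", class(x)[1L])
}

fit <- lm(mpg ~ wt, data = mtcars)

# predict.lm returns a named numeric vector by default...
single <- coerce_predictions(predict(fit, newdata = mtcars))

# ...and a matrix (fit, lwr, upr) when an interval is requested.
multi <- coerce_predictions(
  predict(fit, newdata = mtcars, interval = "confidence")
)
```

Here `single` stays a vector and `multi` becomes a data.frame with columns `fit`, `lwr` and `upr`, so downstream code only has to handle two shapes.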

drsimonj, Jun 05 '17 22:06

In the case of a single set of predictions, we must consider whether a vector is preferable to a one-column data.frame. Allowing either a vector or a data.frame could cause confusion, because the output structure is not always the same. On the other hand, a one-column data.frame is inconvenient in many cases (e.g. in `dplyr::mutate` statements).

drsimonj, Jun 05 '17 22:06

After I posted a question on Twitter, a few notes came up about cases where this may not apply:

drsimonj, Jun 06 '17 02:06