predict.train.kknn() does not respect all parameters from train.kknn()

Open • schoonees opened this issue 4 years ago • 1 comment

predict.train.kknn() ignores some of the arguments passed to train.kknn(); scale is one example.

For example, predicting from a train.kknn() fit gives the same results whether scale = FALSE or scale = TRUE:

library(tidymodels)
data("mtcars")
set.seed(1)
mtcars_split <- initial_split(mtcars, prop = 0.7)

## scale = FALSE
kknn::train.kknn(formula = mpg ~ disp + wt, data = training(mtcars_split), 
                 ks = 5, scale = FALSE) %>% 
  predict(testing(mtcars_split))
#> [1] 21.032 21.784 16.668 16.052 21.264 16.404 26.340 16.076 15.620

## scale = TRUE
kknn::train.kknn(formula = mpg ~ disp + wt, data = training(mtcars_split), 
                 ks = 5, scale = TRUE) %>% 
  predict(testing(mtcars_split))
#> [1] 21.032 21.784 16.668 16.052 21.264 16.404 26.340 16.076 15.620

Calling kknn() directly, however, correctly gives different predictions for the two settings:

## scale = FALSE
kknn::kknn(formula = mpg ~ disp + wt, train = training(mtcars_split), 
           test = testing(mtcars_split), k = 5, scale = FALSE) %>% 
  predict(newdata = testing(mtcars_split))
#> [1] 21.276 21.276 16.860 16.276 21.276 16.404 29.680 15.700 16.020

## scale = TRUE
kknn::kknn(formula = mpg ~ disp + wt, train = training(mtcars_split), 
           test = testing(mtcars_split), k = 5, scale = TRUE) %>% 
  predict(newdata = testing(mtcars_split))
#> [1] 21.032 21.784 16.668 16.052 21.264 16.404 26.340 16.076 15.620

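The issue is that kknn::predict.train.kknn() forwards only some of the arguments originally passed to train.kknn(). scale, na.action, ykernel and contrasts are not passed along to the kknn() call inside kknn::predict.train.kknn(), so kknn() falls back to its defaults for them. The default is scale = TRUE, which is why both train.kknn() results above match the kknn() output for scale = TRUE.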
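One way to check which setting a prediction actually used, without tidymodels, is to compare it against direct kknn() calls for both values of scale. This is only a sketch: the split via sample(), the names train_df, test_df and direct(), and the use of all.equal() are my own choices, not part of the original report.

```r
library(kknn)

# Assumed split for illustration: 22 training rows, 10 test rows.
set.seed(1)
idx <- sample(nrow(mtcars), size = 22)
train_df <- mtcars[idx, ]
test_df <- mtcars[-idx, ]

# Fit with scale = FALSE, then predict through predict.train.kknn().
fit_unscaled <- train.kknn(mpg ~ disp + wt, data = train_df,
                           ks = 5, scale = FALSE)
pred_train <- as.numeric(predict(fit_unscaled, test_df))

# Hypothetical helper: reference predictions from kknn() itself,
# with scale set explicitly.
direct <- function(scale) {
  kknn(mpg ~ disp + wt, train = train_df, test = test_df,
       k = 5, scale = scale)$fitted.values
}

# If "scaled" is TRUE and "unscaled" is FALSE, scale = FALSE was ignored.
matches_scaled <- isTRUE(all.equal(pred_train, direct(TRUE)))
matches_unscaled <- isTRUE(all.equal(pred_train, direct(FALSE)))
print(c(scaled = matches_scaled, unscaled = matches_unscaled))
```

The same comparison could be repeated for ykernel or contrasts by adding those arguments to both calls.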

A fix would involve reading the $call entry of the train.kknn object more carefully and forwarding those arguments to kknn().
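As a rough user-side sketch of that idea (not a patch to the package), the stored call can be inspected for scale and the value forwarded to kknn(). The helpers call_arg() and predict_with_scale() are hypothetical names, and the sketch assumes the fitted object exposes $call, $terms, $best.parameters and $distance as described in the train.kknn() documentation.

```r
library(kknn)

# Hypothetical helper (not part of kknn): read one argument from the stored
# call, falling back to a default when the user did not supply it.
call_arg <- function(fit, name, default, env = parent.frame()) {
  force(env)
  arg <- as.list(fit$call)[[name]]
  if (is.null(arg)) default else eval(arg, envir = env)
}

# Hypothetical wrapper: predict with the tuned k, kernel and distance, plus
# the scale setting recovered from the original train.kknn() call.
predict_with_scale <- function(fit, train, newdata, env = parent.frame()) {
  force(env)
  kknn(
    formula = formula(fit$terms),
    train = train, test = newdata,
    k = fit$best.parameters$k,
    kernel = fit$best.parameters$kernel,
    distance = fit$distance,
    scale = call_arg(fit, "scale", default = TRUE, env = env)
  )$fitted.values
}

# Illustrative split by row position (an assumption, not the split above).
train_df <- mtcars[1:22, ]
test_df <- mtcars[23:32, ]

fit <- train.kknn(mpg ~ disp + wt, data = train_df, ks = 5, scale = FALSE)
recovered_scale <- call_arg(fit, "scale", default = TRUE)
preds <- predict_with_scale(fit, train = train_df, newdata = test_df)
```

Evaluating the captured argument in the caller's environment works for simple cases like a literal TRUE or FALSE; a real fix inside the package would more likely store the evaluated values on the fitted object rather than re-evaluate the call.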

See also this SO question.

schoonees, Nov 19 '20 14:11

Any thoughts on this?

schoonees, Feb 25 '21 16:02