
error in predicting with predict.train.kknn

Open topepo opened this issue 7 years ago • 0 comments

I ran into a bug that occurs when the number of neighbors (k) is greater than the number of samples being predicted.

It only occurs in the GitHub version; the CRAN version is fine (see note below).

library(kknn)

data(miete)

miete_tr <- miete[-(1:5),    ]
miete_te <- miete[  1:5 , -13]

train.con <- train.kknn(
  nmqm ~ wfl + bjkat + zh,
  data = miete_tr,
  ks = 8,
  kernel = "rectangular"
)

# Try to get the 8-nearest neighbors from the training set of 
# `nrow(miete_tr)` = 1077 households. 
predict(train.con, miete_te)
#> Error in kknn(formula(terms(object)), object$data, newdata, k = object$best.parameters$k, : k must be smaller or equal the number of rows of the training
#>                  set
sessionInfo()
#> R version 3.5.0 (2018-04-23)
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS High Sierra 10.13.6
#> 
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] kknn_1.3.2
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_0.12.19.3  lattice_0.20-35 digest_0.6.18   rprojroot_1.3-2
#>  [5] grid_3.5.0      backports_1.1.2 magrittr_1.5    evaluate_0.12  
#>  [9] stringi_1.2.4   Matrix_1.2-14   rmarkdown_1.9   tools_3.5.0    
#> [13] stringr_1.3.1   igraph_1.2.2    yaml_2.2.0      compiler_3.5.0 
#> [17] pkgconfig_2.0.2 htmltools_0.3.6 knitr_1.20

Created on 2018-10-16 by the reprex package (v0.2.1)

I think that the line

if(k>p) stop('k must be smaller or equal the number of rows of the training
                 set')

should be

if(k>m) stop('k must be smaller or equal the number of rows of the training
                 set')

(Edit: corrected the suggested fix.)
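
To make the difference between the two checks concrete, here is a minimal sketch that does not call kknn at all. The functions `check_current` and `check_proposed` are hypothetical stand-ins for the single line quoted above. I am assuming that `m` is the number of rows in the training set and `p` is the number of rows being predicted; that reading matches the error in the reprex, but it is an assumption about the package internals.

```r
# Hypothetical stand-ins for the check inside kknn(); not the package's code.
# Assumed meanings (inferred from the error above, not confirmed in the source):
#   m = number of rows in the training set
#   p = number of rows being predicted
msg <- "k must be smaller or equal the number of rows of the training set"

check_current <- function(k, m, p) {
  # Compares k against the number of rows being predicted.
  if (k > p) stop(msg)
  invisible(TRUE)
}

check_proposed <- function(k, m, p) {
  # Compares k against the number of training rows, as the message says.
  if (k > m) stop(msg)
  invisible(TRUE)
}

# Values from the reprex: k = 8, 1077 training rows, 5 rows to predict.
current  <- tryCatch(check_current(8, m = 1077, p = 5),
                     error = function(e) "error")
proposed <- tryCatch(check_proposed(8, m = 1077, p = 5),
                     error = function(e) "error")
```

With the reprex values, the first check stops because 8 > 5 even though the training set is far larger than k, while the second check passes.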

topepo, Oct 16 '18 18:10