t.test.process choice ignores bounded nature of performance criteria

Open aljabadi opened this issue 4 years ago • 1 comment

Users have often complained that the chosen keepX is too strict. Here's an example using tune.spca, for component 2:

suppressMessages(library(mixOmics))
data(multidrug)
set.seed(2341)
tune.spca_res1 <- tune.spca(ncomp = 2, X = multidrug$ABC.trans, nrepeat = 5, folds = 3, test.keepX = seq(5,15,5))
#> KeepX =  5 
#> KeepX =  10 
#> KeepX =  15 
#> KeepX =  5 
#> KeepX =  10 
#> KeepX =  15
plot(tune.spca_res1) 

Created on 2020-10-02 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15        
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_AU.UTF-8                 
#>  ctype    en_AU.UTF-8                 
#>  tz       Australia/Melbourne         
#>  date     2020-10-02                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version date       lib source                
#>  assertthat     0.2.1   2019-03-21 [1] CRAN (R 4.0.0)        
#>  backports      1.1.10  2020-09-15 [1] CRAN (R 4.0.2)        
#>  BiocParallel   1.23.2  2020-07-06 [1] Bioconductor          
#>  callr          3.4.4   2020-09-07 [1] CRAN (R 4.0.2)        
#>  cli            2.0.2   2020-02-28 [1] CRAN (R 4.0.0)        
#>  colorspace     1.4-1   2019-03-18 [1] CRAN (R 4.0.0)        
#>  corpcor        1.6.9   2017-04-01 [1] CRAN (R 4.0.0)        
#>  crayon         1.3.4   2017-09-16 [1] CRAN (R 4.0.0)        
#>  curl           4.3     2019-12-02 [1] CRAN (R 4.0.0)        
#>  desc           1.2.0   2018-05-01 [1] CRAN (R 4.0.0)        
#>  devtools       2.3.2   2020-09-18 [1] CRAN (R 4.0.2)        
#>  digest         0.6.25  2020-02-23 [1] CRAN (R 4.0.0)        
#>  dplyr          1.0.2   2020-08-18 [1] CRAN (R 4.0.2)        
#>  ellipse        0.4.2   2020-05-27 [1] CRAN (R 4.0.2)        
#>  ellipsis       0.3.1   2020-05-15 [1] CRAN (R 4.0.0)        
#>  evaluate       0.14    2019-05-28 [1] CRAN (R 4.0.0)        
#>  fansi          0.4.1   2020-01-08 [1] CRAN (R 4.0.0)        
#>  farver         2.0.3   2020-01-16 [1] CRAN (R 4.0.0)        
#>  fs             1.5.0   2020-07-31 [1] CRAN (R 4.0.2)        
#>  generics       0.0.2   2018-11-29 [1] CRAN (R 4.0.0)        
#>  ggplot2      * 3.3.2   2020-06-19 [1] CRAN (R 4.0.2)        
#>  ggrepel        0.8.2   2020-03-08 [1] CRAN (R 4.0.0)        
#>  glue           1.4.2   2020-08-27 [1] CRAN (R 4.0.2)        
#>  gridExtra      2.3     2017-09-09 [1] CRAN (R 4.0.0)        
#>  gtable         0.3.0   2019-03-25 [1] CRAN (R 4.0.0)        
#>  highr          0.8     2019-03-20 [1] CRAN (R 4.0.0)        
#>  htmltools      0.5.0   2020-06-16 [1] CRAN (R 4.0.2)        
#>  httr           1.4.2   2020-07-20 [1] CRAN (R 4.0.2)        
#>  igraph         1.2.5   2020-03-19 [1] CRAN (R 4.0.0)        
#>  knitr          1.30    2020-09-22 [1] CRAN (R 4.0.2)        
#>  labeling       0.3     2014-08-23 [1] CRAN (R 4.0.0)        
#>  lattice      * 0.20-41 2020-04-02 [1] CRAN (R 4.0.2)        
#>  lifecycle      0.2.0   2020-03-06 [1] CRAN (R 4.0.0)        
#>  magrittr       1.5     2014-11-22 [1] CRAN (R 4.0.2)        
#>  MASS         * 7.3-53  2020-09-09 [1] CRAN (R 4.0.2)        
#>  Matrix         1.2-18  2019-11-27 [1] CRAN (R 4.0.2)        
#>  matrixStats    0.57.0  2020-09-25 [1] CRAN (R 4.0.2)        
#>  memoise        1.1.0   2017-04-21 [1] CRAN (R 4.0.0)        
#>  mime           0.9     2020-02-04 [1] CRAN (R 4.0.0)        
#>  mixOmics     * 6.13.67 2017-02-06 [1] Bioconductor (R 4.0.2)
#>  munsell        0.5.0   2018-06-12 [1] CRAN (R 4.0.0)        
#>  pillar         1.4.6   2020-07-10 [1] CRAN (R 4.0.2)        
#>  pkgbuild       1.1.0   2020-07-13 [1] CRAN (R 4.0.2)        
#>  pkgconfig      2.0.3   2019-09-22 [1] CRAN (R 4.0.0)        
#>  pkgload        1.1.0   2020-05-29 [1] CRAN (R 4.0.2)        
#>  plyr           1.8.6   2020-03-03 [1] CRAN (R 4.0.0)        
#>  prettyunits    1.1.1   2020-01-24 [1] CRAN (R 4.0.0)        
#>  processx       3.4.4   2020-09-03 [1] CRAN (R 4.0.2)        
#>  ps             1.3.4   2020-08-11 [1] CRAN (R 4.0.2)        
#>  purrr          0.3.4   2020-04-17 [1] CRAN (R 4.0.0)        
#>  R6             2.4.1   2019-11-12 [1] CRAN (R 4.0.0)        
#>  rARPACK        0.11-0  2016-03-10 [1] CRAN (R 4.0.0)        
#>  RColorBrewer   1.1-2   2014-12-07 [1] CRAN (R 4.0.0)        
#>  Rcpp           1.0.5   2020-07-06 [1] CRAN (R 4.0.2)        
#>  remotes        2.2.0   2020-07-21 [1] CRAN (R 4.0.2)        
#>  reshape2       1.4.4   2020-04-09 [1] CRAN (R 4.0.0)        
#>  rlang          0.4.7   2020-07-09 [1] CRAN (R 4.0.2)        
#>  rmarkdown      2.4     2020-09-30 [1] CRAN (R 4.0.2)        
#>  rprojroot      1.3-2   2018-01-03 [1] CRAN (R 4.0.0)        
#>  RSpectra       0.16-0  2019-12-01 [1] CRAN (R 4.0.0)        
#>  scales         1.1.1   2020-05-11 [1] CRAN (R 4.0.0)        
#>  sessioninfo    1.1.1   2018-11-05 [1] CRAN (R 4.0.0)        
#>  stringi        1.5.3   2020-09-09 [1] CRAN (R 4.0.2)        
#>  stringr        1.4.0   2019-02-10 [1] CRAN (R 4.0.0)        
#>  testthat       2.3.2   2020-03-02 [1] CRAN (R 4.0.0)        
#>  tibble         3.0.3   2020-07-10 [1] CRAN (R 4.0.2)        
#>  tidyr          1.1.2   2020-08-27 [1] CRAN (R 4.0.2)        
#>  tidyselect     1.1.0   2020-05-11 [1] CRAN (R 4.0.0)        
#>  usethis        1.6.3   2020-09-17 [1] CRAN (R 4.0.2)        
#>  vctrs          0.3.4   2020-08-29 [1] CRAN (R 4.0.2)        
#>  withr          2.3.0   2020-09-22 [1] CRAN (R 4.0.2)        
#>  xfun           0.18    2020-09-29 [1] CRAN (R 4.0.2)        
#>  xml2           1.3.2   2020-04-23 [1] CRAN (R 4.0.0)        
#>  yaml           2.2.1   2020-02-01 [1] CRAN (R 4.0.0)        
#> 
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

Initially, we thought it was a threshold issue, but that's not always the case, as we get close to 100%.

possible culprit

To decide which keepX is optimal, we test the significance of the improvement in error rate or correlation (t.test.process). This test ignores the bounded (truncated?) nature of the performance criteria (e.g. correlation lies in [-1, 1]).
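One way to picture the boundedness, as a rough sketch only: the same raw gain in correlation corresponds to very different gains on an unbounded scale depending on how close the values are to the bound. Fisher's z (`atanh`) is used here purely as an illustrative unbounded scale, and the correlation values are made up; none of this is taken from the package code.

```r
# Same raw gain in correlation (0.02), at mid-range and near the upper bound
gain_mid  <- atanh(0.52) - atanh(0.50)
gain_high <- atanh(0.97) - atanh(0.95)

# Gains expressed on the unbounded Fisher z scale (illustrative choice of scale)
z_gains <- c(mid_range = gain_mid, near_bound = gain_high)
print(round(z_gains, 3))
```

A test on the raw scale treats both gains as the same 0.02, whereas on the z scale the gain near the bound is much larger.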

possible solution

Adjust the measure based on a truncated distribution (or transform the current measures into an unbounded space?).
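A minimal sketch of the transformation option, assuming the per-repeat criteria are available as plain numeric vectors: correlations could be mapped with Fisher's z (`atanh`) and error rates with the logit (`qlogis`) before a one-sided t-test. The helpers `to_unbounded` and `compare_keepX`, the `eps` clamp and the numbers are all hypothetical, and the default Welch test from `t.test` is an assumption; this is not what `t.test.process` currently does.

```r
# Hypothetical helper: map a bounded criterion to an unbounded scale.
# "cor" lies in [-1, 1] -> Fisher z (atanh); "error" lies in [0, 1] -> logit.
to_unbounded <- function(x, criterion = c("cor", "error"), eps = 1e-6) {
  criterion <- match.arg(criterion)
  if (criterion == "cor") {
    x <- pmin(pmax(x, -1 + eps), 1 - eps)  # clamp away from the bounds
    atanh(x)
  } else {
    x <- pmin(pmax(x, eps), 1 - eps)
    qlogis(x)
  }
}

# Hypothetical helper: one-sided Welch t-test that 'candidate' improves on
# 'current', on the raw scale and on the transformed scale.
compare_keepX <- function(current, candidate, criterion = c("cor", "error")) {
  criterion <- match.arg(criterion)
  # higher is better for correlation, lower is better for error rate
  alt <- if (criterion == "cor") "greater" else "less"
  raw <- t.test(candidate, current, alternative = alt)$p.value
  trans <- t.test(to_unbounded(candidate, criterion),
                  to_unbounded(current, criterion),
                  alternative = alt)$p.value
  c(raw = raw, transformed = trans)
}

# Made-up per-repeat correlations for two keepX values (5 repeats each)
cor_keepX5  <- c(0.90, 0.93, 0.91, 0.94, 0.92)
cor_keepX10 <- c(0.95, 0.93, 0.96, 0.94, 0.97)
p_vals <- compare_keepX(cor_keepX5, cor_keepX10, criterion = "cor")
print(p_vals)
```

Both transforms are monotone, so the direction of the comparison is unchanged; only the scale on which the improvement is judged differs. Whether this would make the chosen keepX less strict in practice would need to be checked on real tuning output.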

aljabadi, Oct 02 '20 02:10

Hi @aljabadi

A few questions about this one:

  • Was anything implemented for this issue? I can't see any evidence of measure transformations in the tune.xxx() functions.
  • Assuming no adjustments were made, what space would be appropriate to transform these to?
  • How would this transformation work with different criteria (e.g. BER vs cor)?

Max-Bladen, Aug 30 '22 23:08