mixOmics
mixOmics copied to clipboard
t.test.process choice ignores bounded nature of performance criteria
many times users have complained that the chosen keepX is too strict. Here's an example for tune.spca
for component 2:
suppressMessages(library(mixOmics))
data(multidrug)
set.seed(2341)
tune.spca_res1 <- tune.spca(ncomp = 2, X = multidrug$ABC.trans, nrepeat = 5, folds = 3, test.keepX = seq(5,15,5))
#> KeepX = 5
#> KeepX = 10
#> KeepX = 15
#> KeepX = 5
#> KeepX = 10
#> KeepX = 15
plot(tune.spca_res1)
Created on 2020-10-02 by the reprex package (v0.3.0)
Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.2 (2020-06-22)
#> os macOS Catalina 10.15
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_AU.UTF-8
#> ctype en_AU.UTF-8
#> tz Australia/Melbourne
#> date 2020-10-02
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0)
#> backports 1.1.10 2020-09-15 [1] CRAN (R 4.0.2)
#> BiocParallel 1.23.2 2020-07-06 [1] Bioconductor
#> callr 3.4.4 2020-09-07 [1] CRAN (R 4.0.2)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0)
#> colorspace 1.4-1 2019-03-18 [1] CRAN (R 4.0.0)
#> corpcor 1.6.9 2017-04-01 [1] CRAN (R 4.0.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0)
#> curl 4.3 2019-12-02 [1] CRAN (R 4.0.0)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0)
#> devtools 2.3.2 2020-09-18 [1] CRAN (R 4.0.2)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0)
#> dplyr 1.0.2 2020-08-18 [1] CRAN (R 4.0.2)
#> ellipse 0.4.2 2020-05-27 [1] CRAN (R 4.0.2)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0)
#> farver 2.0.3 2020-01-16 [1] CRAN (R 4.0.0)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.0)
#> ggplot2 * 3.3.2 2020-06-19 [1] CRAN (R 4.0.2)
#> ggrepel 0.8.2 2020-03-08 [1] CRAN (R 4.0.0)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> gridExtra 2.3 2017-09-09 [1] CRAN (R 4.0.0)
#> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.0)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.0)
#> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2)
#> httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2)
#> igraph 1.2.5 2020-03-19 [1] CRAN (R 4.0.0)
#> knitr 1.30 2020-09-22 [1] CRAN (R 4.0.2)
#> labeling 0.3 2014-08-23 [1] CRAN (R 4.0.0)
#> lattice * 0.20-41 2020-04-02 [1] CRAN (R 4.0.2)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2)
#> MASS * 7.3-53 2020-09-09 [1] CRAN (R 4.0.2)
#> Matrix 1.2-18 2019-11-27 [1] CRAN (R 4.0.2)
#> matrixStats 0.57.0 2020-09-25 [1] CRAN (R 4.0.2)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0)
#> mime 0.9 2020-02-04 [1] CRAN (R 4.0.0)
#> mixOmics * 6.13.67 2017-02-06 [1] Bioconductor (R 4.0.2)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.0)
#> pillar 1.4.6 2020-07-10 [1] CRAN (R 4.0.2)
#> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2)
#> plyr 1.8.6 2020-03-03 [1] CRAN (R 4.0.0)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0)
#> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2)
#> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0)
#> rARPACK 0.11-0 2016-03-10 [1] CRAN (R 4.0.0)
#> RColorBrewer 1.1-2 2014-12-07 [1] CRAN (R 4.0.0)
#> Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.2)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2)
#> reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.0.0)
#> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2)
#> rmarkdown 2.4 2020-09-30 [1] CRAN (R 4.0.2)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0)
#> RSpectra 0.16-0 2019-12-01 [1] CRAN (R 4.0.0)
#> scales 1.1.1 2020-05-11 [1] CRAN (R 4.0.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0)
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0)
#> tibble 3.0.3 2020-07-10 [1] CRAN (R 4.0.2)
#> tidyr 1.1.2 2020-08-27 [1] CRAN (R 4.0.2)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.0)
#> usethis 1.6.3 2020-09-17 [1] CRAN (R 4.0.2)
#> vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2)
#> withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.2)
#> xfun 0.18 2020-09-29 [1] CRAN (R 4.0.2)
#> xml2 1.3.2 2020-04-23 [1] CRAN (R 4.0.0)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
Initially, we thought it's a threshold issue, but that's not always the case as we get close to 100%.
possible culprit
We measure the significance of the improvement in error rate or correlation to decide the optimality. It ignores the bounded (truncated?) nature of the performance criteria (e.g. correlation is b/w [-1,1]).
possible solution
Adjust the measure based on a truncated distribution, (or transform current measures to take them into an unbounded space?)
Hi @aljabadi
Few questions about this one:
- Were any implementations made for this issue? I can't see any evidence of measure transformations in the
tune.xxx()
functions - Assuming no adjustments were made, what space would be appropriate to transform these too?
- How would this transformation work using different criteria (eg. BER vs cor)?