Extending `BPPARAM` usage to all `tune()` variants

Open Max-Bladen opened this issue 2 years ago • 1 comments

While rectifying the problems noted in Issue #214, I noticed that BiocParallel usage is inconsistent across the tune() variants. After exploring this wider issue, I have come to the following conclusions:

  • tune
    • this is the wrapper function which can call all of the functions below, save for tune.block.splsda for some reason. tune() still takes cpus. It handles the subsequent functions as follows:
      • tune.mint.splsda() - not passed any parallelisation parameters due to the forced use of LOOCV. Hence, it cannot be parallelised. Nothing to change here
        • if called directly: totally fine due to LOOCV
      • tune.rcc() - not passed any parallelisation parameters due to the lack of any implementation of parallelisation within this function. Once tune.rcc() can handle multiple CPU usage, this should be adjusted. Therefore, this will be left for now
        • if called directly: cannot use any parallelisation
      • tune.pca() - not passed any parallelisation parameters due to this function not involving any repeated cross validation. It merely calculates the explained variance per component. Nothing to change here
        • if called directly: cannot use any parallelisation
      • tune.spca() - not passed any parallelisation parameters despite the function having BiocParallel implementation. Need changes here
        • if called directly: must use BPPARAM and not cpus
      • tune.splsda() - passed cpus directly from input. Need changes here
        • if called directly: uses cpus and has no BiocParallel implementation. Uses its own methodology, involving makePSOCKcluster()/makeForkCluster(). Need to do more exploration here
      • tune.spls() - prior to the function call, a BPPARAM object is generated using cpus to control the worker count. This BiocParallel object is passed to the function. Need changes here
        • if called directly: must use BPPARAM and not cpus
      • tune.splslevel() - not passed any parallelisation parameters due to a lack of parallelisation implementation. Probably not worth spending time on, but if so then need changes here
        • if called directly: cannot use any parallelisation
  • tune.block.splsda
    • this is not wrapped by tune() for some reason. It does take BPPARAM rather than cpus and applies it properly
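
To make the cpus-to-BPPARAM translation described for tune.spls() concrete, here is a minimal sketch of how a wrapper could turn a legacy cpus value into a BiocParallel object. The helper name cpus_to_bpparam() is hypothetical and not part of mixOmics, and the choice of backend per platform is an assumption, not what tune() currently does.

```r
library(BiocParallel)

# Hypothetical helper (not part of mixOmics): map a legacy `cpus` value
# onto a BiocParallel backend object.
cpus_to_bpparam <- function(cpus = 1L) {
  cpus <- suppressWarnings(as.integer(cpus))
  if (length(cpus) != 1L || is.na(cpus) || cpus < 1L) {
    stop("'cpus' must be a single positive integer")
  }
  if (cpus == 1L) {
    return(SerialParam())            # no parallelisation requested
  }
  if (.Platform$OS.type == "windows") {
    SnowParam(workers = cpus)        # socket-based workers; forking is unavailable
  } else {
    MulticoreParam(workers = cpus)   # forked workers
  }
}

# Any function that accepts BPPARAM can then receive the result.
param <- cpus_to_bpparam(2)
res <- bplapply(1:4, function(i) i^2, BPPARAM = param)
```

With a helper along these lines, the wrapper could keep accepting cpus for backwards compatibility while every variant that supports parallelisation only ever sees BPPARAM.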

These conclusions were drawn from reading the source code as well as running these functions while observing any errors (e.g. unused argument (BPPARAM = param)) and the runtime when using 1 vs 14 cores.
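
The "unused argument" part of that check can be approximated without running anything, by inspecting each function's formal arguments. The sketch below is only an illustration: parallel_args() and the three toy functions are hypothetical, and the loop over the real variants only runs if mixOmics happens to be installed. Note that a function taking `...` may still forward cpus or BPPARAM without declaring it, so this is a first pass and not a substitute for timing runs.

```r
# Report which of the two parallelisation arguments a function declares.
parallel_args <- function(fn) {
  intersect(c("cpus", "BPPARAM"), names(formals(fn)))
}

# Toy stand-ins (hypothetical) for the three situations seen in the list above.
legacy_fn <- function(X, cpus = 1) NULL        # takes cpus only
modern_fn <- function(X, BPPARAM = NULL) NULL  # takes BPPARAM only
serial_fn <- function(X) NULL                  # no parallelisation argument

parallel_args(legacy_fn)
parallel_args(modern_fn)
parallel_args(serial_fn)

# Apply the same check to the tune() variants named in this issue,
# skipping any name that the installed mixOmics version does not provide.
variants <- c("tune", "tune.mint.splsda", "tune.rcc", "tune.pca", "tune.spca",
              "tune.splsda", "tune.spls", "tune.splslevel", "tune.block.splsda")
if (requireNamespace("mixOmics", quietly = TRUE)) {
  ns <- asNamespace("mixOmics")
  for (nm in variants) {
    if (exists(nm, envir = ns, inherits = FALSE)) {
      found <- parallel_args(get(nm, envir = ns))
      label <- if (length(found) > 0) paste(found, collapse = ", ") else "(none declared)"
      cat(sprintf("%-20s %s\n", nm, label))
    }
  }
}
```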

Note to self: when creating a new branch from this issue, fork it from branch issue-214 to carry over changes to tune.spls()

Max-Bladen avatar May 11 '22 01:05 Max-Bladen

@Max-Bladen any updates, would love to use this feature

plopez842 avatar Feb 19 '24 00:02 plopez842