mlr3book Showcase bigger benchmarks on HPC / Multicore Systems

mlr3 offers great flexibility for writing down and executing even bigger benchmarks on HPC systems through future and future.batchtools.

This is as easy as writing the code below, but we need a good template / example on how to do this.

Additionally, future seems to expose several parameters that have to be correctly set (?) for things to work more reliably.

library("mlr3")
library("batchtools")
library("future.batchtools")

plan(batchtools_slurm, template = "~/slurm.tmpl")

design = benchmark_grid(
  tasks = tsk("iris"),
  learners = list(lrn("classif.rpart"), lrn("classif.featureless")),
  resamplings = rsmp("cv")
)
benchmark(design)
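
Before submitting to Slurm, the same design can be smoke-tested on a local backend. A minimal sketch, assuming future's multisession backend as a stand-in for the cluster; the two workers and the reduced folds = 3 are my choices for a quick check, not part of the setup above:

```r
library("mlr3")
library("future")

# Assumption: two local background R sessions replace the Slurm backend.
plan(multisession, workers = 2)

design = benchmark_grid(
  tasks = tsk("iris"),
  learners = list(lrn("classif.rpart"), lrn("classif.featureless")),
  resamplings = rsmp("cv", folds = 3)
)

# benchmark() picks up whatever plan is currently active.
bmr = benchmark(design)

# One aggregated row per learner.
scores = bmr$aggregate(msr("classif.ce"))
print(scores[, c("learner_id", "classif.ce")])

# Reset to sequential execution so later code is unaffected.
plan(sequential)
```

If this runs cleanly, swapping the plan() call back to batchtools_slurm should be the only change needed, assuming the template file is valid for the cluster.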

General comments:

# Guard against segfaults by encapsulating the learner.
learner$encapsulate
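
For reference, a hedged sketch of what enabling encapsulation could look like. As far as I know the interface has changed across mlr3 releases: around the time of this thread encapsulate was a field taking a named vector, while later releases expose it as a method that also takes a fallback learner, so the sketch checks which form is present. It assumes the callr package is installed; using classif.featureless as the fallback is an illustrative choice.

```r
library("mlr3")

learner = lrn("classif.rpart")

if (is.function(learner$encapsulate)) {
  # Assumption: newer mlr3 releases, where encapsulate is a method
  # that sets the method and a fallback learner together.
  learner$encapsulate("callr", fallback = lrn("classif.featureless"))
} else {
  # Assumption: older mlr3 releases, where encapsulate is a field.
  learner$encapsulate = c(train = "callr", predict = "callr")
}

# With "callr", train and predict run in a separate R process, so a crash
# of that process should not take down the session running the benchmark.
print(learner)
```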

Things I would like to see:

An introduction to how mlr3 unnests the benchmark, and how this works when tuning is involved.
An explanation of the cases in which we end up with nested futures.
Can I restart failed jobs, and if so, how?
What happens if I see, for example, that a Learner used in the benchmark was misconfigured? Can I fix the learner and restart the jobs? Or, more specifically, at what point should I switch back to batchtools?
Which configuration parameters of future are relevant there?
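
To make the "switch back to batchtools" question concrete, here is a rough sketch of what driving the benchmark through a batchtools registry directly might look like, with one job per row of the design. This is only an illustration under my own assumptions, not an established mlr3 workflow: run_row is a hypothetical helper, and the temporary registry runs jobs in the current session rather than on Slurm.

```r
library("mlr3")
library("batchtools")

design = benchmark_grid(
  tasks = tsk("iris"),
  learners = list(lrn("classif.rpart"), lrn("classif.featureless")),
  resamplings = rsmp("cv", folds = 3)
)

# Temporary registry; on a cluster one would configure cluster functions
# (for example for Slurm) instead of the default interactive execution.
reg = makeRegistry(file.dir = NA, seed = 1)

# Hypothetical helper: benchmark a single row of the design.
run_row = function(i, design) {
  mlr3::benchmark(design[i, ])
}
batchMap(run_row, i = seq_len(nrow(design)),
  more.args = list(design = design), reg = reg)

submitJobs(reg = reg)
waitForJobs(reg = reg)

# Failed jobs can be inspected and resubmitted by their ids.
failed = findErrors(reg = reg)
if (nrow(failed) > 0L) {
  print(getErrorMessages(failed, reg = reg))
  submitJobs(failed, reg = reg)
  waitForJobs(reg = reg)
}

# Collect the per-job BenchmarkResults and merge them into one.
results = reduceResultsList(reg = reg)
bmr = Reduce(function(a, b) a$combine(b), results)
print(bmr$aggregate(msr("classif.ce")))
```

The registry keeps per-job state on disk, which is what makes selective resubmission possible here. Note that the job function is stored at batchMap() time, so fixing a misconfigured learner would presumably mean defining the affected jobs again rather than just resubmitting them.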

Dec 10 '19 19:12 pfistfl

Bonus question:

If I use benchmark on the following learner, what are my parallelization levels?

My understanding is that benchmark() flattens the benchmark into a single parallelization level.
Does tuning call benchmark() internally now?

library("mlr3")
library("batchtools")
library("future.batchtools")
plan(list(
  tweak(batchtools_slurm, template = "~/slurm.tmpl"),
  multicore
))


library(mlr3pipelines)
library(mlr3tuning)
library(mlr3learners)
resampling = rsmp("cv", folds = 3)
measure = msr("classif.ce")
tuner = tnr("grid_search", resolution = 10)
terminator = trm("evals", n_evals = 10)

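# Build the pipeline: scale, encode, then xgboost on a single thread.
graph = po("scale") %>>%
  po("encode") %>>%
  po(lrn("classif.xgboost", predict_type = "prob", nthread = 1L))
graph$keep_results = TRUE

# Tune nrounds on a log scale; the trafo maps it back to an integer count.
tune_ps = paradox::ParamSet$new(list(
  paradox::ParamDbl$new("classif.xgboost.nrounds", lower = 1, upper = log(100))))
tune_ps$trafo = function(x, param_set) {
  x$classif.xgboost.nrounds = round(exp(x$classif.xgboost.nrounds))
  return(x)
}

# Wrap the graph in an AutoTuner, then wrap that in a GraphLearner again.
# The variable is named `learner` to avoid shadowing the lrn() function.
at = AutoTuner$new(GraphLearner$new(graph), resampling, measure, tune_ps, terminator, tuner)
learner = GraphLearner$new(at)

benchmark(benchmark_grid(tsk("iris"), learner, resampling))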

More specifically: Do I use plan(list(multicore, multicore)) or something else?
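
As a small aid for reasoning about the levels, the sketch below shows only how future itself resolves a two-level plan. Which mlr3 loop (benchmark, resampling, tuning) ends up on which level is exactly the open question and is not answered by it. It assumes local multisession workers as a stand-in for batchtools_slurm and a sequential second level in place of multicore.

```r
library("future")

# Two-level plan: the first level uses two local background sessions,
# the second level (used by futures created inside a future) is sequential.
plan(list(tweak(multisession, workers = 2), sequential))

# Workers available at the first level, as seen from the main session.
top_workers = nbrOfWorkers()

# Inside a future, the first level has been consumed, so this reports
# the workers available to futures created at the second level.
outer = future({
  future::nbrOfWorkers()
})
nested_workers = value(outer)

print(c(top = top_workers, nested = nested_workers))

# Reset to sequential execution.
plan(sequential)
```

With plan(list(multicore, multicore)), the second entry would likewise only take effect for futures created inside futures of the first level.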

Dec 12 '19 13:12 pfistfl

@mllg I guess you might be the best person to answer this!

Aug 06 '20 15:08 pfistfl

I believe this is now covered in 6.1 and 6.3. I am unsure whether even more technical detail is required in the book; it might be a better fit for the gallery. Thoughts, @pfistfl and @mllg?

Oct 19 '22 14:10 RaphaelS1

Closing for now, let me know if you want to reopen/add @mllg

Jan 08 '23 16:01 RaphaelS1
