mlr3torch
simple use case: rf vs torch: no factors, no missing values
Goal: Compare random forests with a simple multi-layer perceptron (MLP) in a small benchmark experiment.
- We need to define the tasks:
  Use three small, simple tasks from OpenML (see https://mlr3book.mlr-org.com/chapters/chapter11/large-scale_benchmarking.html). Simple means no missing values and no factor variables, i.e. only numeric features. Use classification tasks only (see the sketch after this list).
- We need to define the learners:
  - use a `classif.ranger` (no hyperparameter tuning)
  - use a `classif.mlp` with hyperparameter tuning: for that we need to wrap the `classif.mlp` learner in an `AutoTuner`. You need to define:
    - an optimization strategy: grid search
    - a search space: tune over neurons, batch_size, the dropout rate, and epochs (the epochs could be tuned via early stopping, see https://github.com/mlr-org/mlr3book/pull/829/files#diff-34e37e3d914ffb3452f3aea4f890a134800d94534232de9c179dfc1d2706f137, but this is not necessary). It is probably fine to run these small networks on the CPU.
    - a tuning measure: some standard classification measure such as accuracy
    - a resampling strategy: k-fold CV
    - the number of evaluations for tuning (the "terminator"): set this small enough that everything runs in a reasonable amount of time.
- We need to define the (outer) resampling strategy: also just some cross-validation.
- We also need to parallelize the experiment execution using the future package: https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html#sec-parallel-learner
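One way to define the tasks is via the mlr3oml package. The following is only a sketch: the filter values and taking the first three hits are placeholder choices, and the returned candidates would still need a manual check that they contain no factor features.

library(mlr3)
library(mlr3oml)

# query OpenML for small classification tasks without missing values
# (the filter values are placeholders -- adjust and inspect the result)
candidates = list_oml_tasks(
  type = "classif",
  number_instances = c(100, 1000),
  number_missing_values = 0,
  limit = 100
)

# turn three of the returned task IDs into mlr3 tasks
# (taking the first three hits is arbitrary; verify the features are purely numeric)
tasks = lapply(candidates$task_id[1:3], function(id) tsk("oml", task_id = id))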
An example:
library(mlr3torch)
library(mlr3tuning)
library(mlr3learners)

learner = lrn("classif.mlp",
  # define the tuning space via to_tune() tokens
  # use a batch size of either 16, 32, or 64
  batch_size = to_tune(c(16, 32, 64)),
  # tune the dropout probability in the interval [0.1, 0.9]
  p = to_tune(0.1, 0.9),
  # (the number of neurons could be tuned analogously)
  # tune the epochs using early stopping (internal = TRUE)
  epochs = to_tune(upper = 1000L, internal = TRUE),
  # configure the early stopping / validation:
  # hold out 30% of the training data as validation set
  validate = 0.3,
  # measure that is monitored on the validation set
  measures_valid = msr("classif.acc"),
  # stop after 10 epochs without improvement
  patience = 10,
  device = "cpu"
)
# wrap the learner in an AutoTuner that tunes via grid search
# on an inner cross-validation
at = auto_tuner(
  learner = learner,
  tuner = tnr("grid_search"),
  resampling = rsmp("cv"),
  measure = msr("classif.acc"),
  term_evals = 10
)
# sanity check: train the AutoTuner on a single task
task = tsk("iris")
at$train(task)
# parallelize over the (outer) resampling iterations
future::plan("multisession")
design = benchmark_grid(
  tasks = tsk("iris"),
  learners = list(at, lrn("classif.ranger")),
  resamplings = rsmp("cv", folds = 10)
)
bmr = benchmark(design)
# this parallelizes the outer resampling, not the inner resampling
# (the AutoTuner itself can also parallelize its inner tuning):
# 1. apply learner at to fold 1 of iris (outer)
# 2. apply learner at to fold 2 of iris (outer)
# ...
# 10. apply learner at to fold 10 of iris (outer)
# 11. apply learner ranger to fold 1 of iris (outer)
# ...
# 20. apply learner ranger to fold 10 of iris (outer)
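To actually compare the two learners, aggregate the benchmark result; and if the inner tuning loop should also run in parallel, future supports nested plans. A sketch, assuming the result was stored in bmr as above; the worker counts are arbitrary:

# score per learner, e.g. mean accuracy across the outer folds
bmr$aggregate(msr("classif.acc"))

# optional: nested parallelization -- 2 workers for the outer resampling,
# each spawning 5 workers for the inner tuning
future::plan(list(
  future::tweak("multisession", workers = 2),
  future::tweak("multisession", workers = 5)
))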