
Avoid duplicated results in tuning instance?

Open giuseppec opened this issue 5 years ago • 13 comments

Not sure if the code below should produce at least a warning that the learner was already evaluated with the same parameters:

library(mlr3)
library(mlr3learners)
library(mlr3tuning)
task = tsk("sonar")
learner = lrn("classif.kknn", predict_type = "prob")
learner$param_set
tune_ps = ParamSet$new(list(
  ParamInt$new("k", lower = 1, upper = 2)
))

instance = TuningInstance$new(
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measures = msr("classif.auc"),
  param_set = tune_ps,
  terminator = term("none")
)

set.seed(1)
tuner_grid = tnr("grid_search", resolution = 2)
tuner_grid$tune(instance)
tuner_grid$tune(instance) # causes duplicated results if the user runs this line multiple times "accidentally"

perfdata = instance$archive("params")
perfdata[, c("nr", "k", "classif.auc")]
   nr k classif.auc
1:  1 1   0.7978992
2:  2 2   0.8735294
3:  3 1   0.7978992
4:  4 2   0.8735294

If the learner is stochastic, such as ranger, something like this could happen:

   nr mtry classif.ce
1:  1    1  0.1884058
2:  2    2  0.1594203
3:  3    1  0.1739130
4:  4    2  0.1884058

Maybe storing results of hyperparameter combinations that were already evaluated should be avoided?
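A minimal post-hoc workaround (a sketch only, assuming the archive is returned as a data.table as in the example above; note that for a stochastic learner this silently drops the repeated measurements):

library(data.table)

# keep only the first evaluation of each value of k from the archive above
perfdata = as.data.table(instance$archive("params"))
unique(perfdata, by = "k")[, c("nr", "k", "classif.auc")]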

giuseppec avatar Nov 01 '19 17:11 giuseppec

That's nearly the same issue as #127.

berndbischl avatar Nov 01 '19 17:11 berndbischl

Currently we see it this way: if that happens, that's an aspect of your tuner / the size of your search space. I don't think it's very easy to handle this transparently. What should we do?

  • I cannot easily change the tuning algorithm so that this does not happen.
  • I cannot "remove" the proposed point.
  • I could directly return the already stored value, but what does this mean for terminators? And, as you already said, if the algorithm is stochastic the result might even be different.

I can warn about this, but I dislike warnings in general somewhat. Here, it might be fine.

@mllg ?

berndbischl avatar Nov 01 '19 17:11 berndbischl

@giuseppec also, your use case seems rather odd? You run the tuner twice on the same instance?

berndbischl avatar Nov 01 '19 17:11 berndbischl

Yeah, it's not a real "use case". I just stumbled over this "issue" because I accidentally ran the line tuner_grid$tune(instance) twice, one after the other, and then wondered about the duplicated results. Maybe this is just "own stupidity" when it happens, but I wanted to mention it here because I wasn't sure whether it has any other implications.

giuseppec avatar Nov 01 '19 17:11 giuseppec

Yeah, it's not a real "use case". I just stumbled over this "issue" because I accidentally ran the line tuner_grid$tune(instance) twice, one after the other, and then wondered about the duplicated results.

That's absolutely fine, and it's good to report such things. I also opened #204.

berndbischl avatar Nov 01 '19 17:11 berndbischl

I think it is perfectly fine that the code runs like that. A tuner is told to run on an instance, and we specifically allow the instance(?) to not be empty. If the tuner does not incorporate the archive (e.g. random search), that is fine. If the tuner is stupid and always does the same thing (e.g. grid search), that is the user's problem. The user could also change e.g. inst$learner$param_set$values$distance (in your example) and run the tuner again. This would also be totally valid.
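For example, something like this (a sketch continuing the kknn example above, using the inst$learner access mentioned here; distance is a hyperparameter of classif.kknn that is not part of the tuned search space):

# change a hyperparameter that is not tuned over, then run the tuner again;
# these are genuinely new evaluations, not duplicates of earlier ones
instance$learner$param_set$values$distance = 1
tuner_grid$tune(instance)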

Consequently, a warning should only be issued if it is 100% necessary.

jakob-r avatar Nov 07 '19 08:11 jakob-r

Agreeing with @jakob-r, although I guess I COULD see the case for a warning. OTOH, warnings tend to get annoying. Can we get a quick vote on whether you want to see a warning in the general case that a configuration is evaluated multiple times?

@jakob-r @mllg @mb706 @giuseppec @larskotthoff @pfistfl

(Edit by @jakob-r: please vote with :+1: and :-1:)

berndbischl avatar Nov 07 '19 08:11 berndbischl

Do tuners aggregate the configurations that were evaluated multiple times, or do they just report the best one? Both behaviours would have their problems. The eval_batch() call could also refuse to evaluate the configuration a second time and just return the previous result, so that "dumb" algos like grid search and random search (with discrete search space) don't trip over this. Explicitly multi-fidelity algos like hyperband and MBO would then have to set an extra flag in the eval_batch() call to get around this.
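To illustrate the "refuse to evaluate again" idea from the tuner's side, here is a rough sketch of filtering a batch of proposed configurations against the archive before calling eval_batch(). This is not mlr3tuning code; it builds on the old API from the example above and assumes the archive has one column per hyperparameter:

library(data.table)

# hypothetical helper: drop proposed configurations that were already evaluated,
# then evaluate only the remaining ones
eval_batch_dedup = function(instance, xdt) {
  archive = as.data.table(instance$archive("params"))
  cols = intersect(names(xdt), names(archive))
  todo = fsetdiff(xdt[, ..cols], archive[, ..cols])
  if (nrow(todo) > 0L) instance$eval_batch(todo)
}

# a grid-search-like tuner would call this instead of instance$eval_batch(xdt);
# after the runs above, k = 1 and k = 2 are skipped entirely
eval_batch_dedup(instance, data.table(k = 1:2))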

mb706 avatar Dec 17 '19 10:12 mb706

Currently mlr3tuning does not talk about / worry about "repeated" evals. Every evaluation is treated as "different", although it might not be.

If we want to handle this, a proposal must be written down carefully first.

berndbischl avatar Dec 17 '19 11:12 berndbischl

Proposal:

TuningInstance$eval_batch() gets an argument reevaluate with default FALSE. If reevaluate is FALSE, then configurations that are already in self$bmr are not evaluated again; instead, their performance from previous runs is used and returned as perf. If reevaluate is TRUE, then the behaviour is as it is right now (and without a warning message). Random search and grid search use reevaluate = FALSE; some other algorithms (e.g. irace) may use reevaluate = TRUE. Algorithms that use reevaluate = TRUE need to take special care about which performance they report in assign_result().
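To make the semantics concrete, a small self-contained toy (none of this is mlr3tuning code; archive and evaluate() are stand-ins for self$bmr and resampling):

library(data.table)

archive = data.table(k = integer(), perf = numeric())   # stand-in for self$bmr
evaluate = function(xdt) runif(nrow(xdt))                # stand-in for resampling

eval_batch_sketch = function(xdt, reevaluate = FALSE) {
  if (reevaluate || nrow(archive) == 0L) {
    todo = xdt
  } else {
    stored = archive[xdt, on = names(xdt)]   # look up previously stored results
    todo = xdt[is.na(stored$perf)]           # only evaluate unseen configurations
  }
  if (nrow(todo) > 0L) {
    archive <<- rbind(archive, cbind(todo, perf = evaluate(todo)))
  }
  # one performance value per proposed row, reusing stored values where possible
  archive[xdt, on = names(xdt), mult = "last"]$perf
}

eval_batch_sketch(data.table(k = 1:2))   # evaluates both configurations
eval_batch_sketch(data.table(k = 1:2))   # returns the stored results, evaluates nothing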

mb706 avatar Dec 17 '19 12:12 mb706

Would it be better if reevaluate were a property of the TuningInstance instead of an argument that has to be passed all the time? It could be an active binding that is set to TRUE if the learner is deterministic and to FALSE if the learner is stochastic. The user should probably be able to override this behavior. The tuner can then adapt its behavior accordingly.
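A minimal R6 sketch of that idea (hypothetical; mlr3 learners do not expose a "stochastic" property, so the default check below is an assumption, and the mapping follows the suggestion above):

library(R6)

# hypothetical instance with reevaluate as an active binding;
# not the actual TuningInstance implementation
InstanceSketch = R6Class("InstanceSketch",
  public = list(
    learner = NULL,
    initialize = function(learner) {
      self$learner = learner
    }
  ),
  active = list(
    reevaluate = function(rhs) {
      if (!missing(rhs)) {
        private$.reevaluate = isTRUE(rhs)   # user override
      } else if (!is.null(private$.reevaluate)) {
        private$.reevaluate
      } else {
        # default as suggested above: TRUE if the learner is deterministic
        # ("stochastic" is an assumed property name, not real mlr3 API)
        !("stochastic" %in% self$learner$properties)
      }
    }
  ),
  private = list(.reevaluate = NULL)
)

A tuner could then simply read instance$reevaluate, and a user could still override it with instance$reevaluate = TRUE.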

jakob-r avatar Dec 18 '19 11:12 jakob-r

This is something that the tuning algorithm should decide, not the user (although the user could possibly set an argument of the tuning algorithm that changes this). Having information about whether the learning algorithm is stochastic would still be a good idea.

mb706 avatar Dec 18 '19 11:12 mb706

This is something that the tuning algorithm should decide, not the user (although the user could possibly set an argument of the tuning algorithm that changes this). Having information about whether the learning algorithm is stochastic would still be a good idea.

irace has an option deterministic to specify whether running the same configuration on the same instance will produce the same value. If deterministic=true, irace never evaluates the same configuration on the same instance more than once. If deterministic=false, irace makes sure to vary the seed passed when re-evaluating, so that the same configuration is never evaluated on the same instance-seed pair.

I would consider deterministic a setting of the scenario, not of the tuner (the tuner may use it or ignore it).
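For reference, a rough sketch of how this is declared on the irace side (option names as I recall them from the irace scenario documentation; please double-check before relying on this):

library(irace)

# minimal scenario sketch: the target algorithm is declared deterministic, so irace
# will not evaluate the same configuration more than once on the same instance;
# targetRunner and instances are placeholders
scenario = list(
  targetRunner   = "./target-runner",
  instances      = c("instance-1", "instance-2"),
  maxExperiments = 500,
  deterministic  = TRUE
)
# this list would then be passed to irace(scenario = scenario, parameters = ...)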

MLopez-Ibanez avatar Jun 23 '22 12:06 MLopez-Ibanez

We decided that the tuner should handle this.

be-marc avatar Aug 15 '24 13:08 be-marc