[Question] Modify Stopping Criterion to Accuracy

Open DarkEol opened this issue 3 years ago • 3 comments

Hi,

I have seen that the stopping criterion can be changed to a number of evaluations using the smac_scenario_args 'runcount_limit' param (451), or limited by cost value using a callback function (Early stopping and Callbacks). But is there a way to set accuracy as a stopping criterion, so that the search stops once some level of accuracy is reached? I see that there is a "termination_cost_threshold" parameter in SMAC; can I use it from Auto-Sklearn? Or is there another way to set accuracy as the stopping criterion?

DarkEol avatar Nov 27 '22 18:11 DarkEol

Hi @DarkEol,

The accuracy can be accessed through the RunValue in the callback, as shown in the example. You would terminate the evaluations in much the same way. However, this is just for a single model and not the final ensemble, i.e. the ensemble's performance on your test set or even the validation set may be slightly better or worse.

The termination_cost_threshold is part of the "new" SMAC, which we are slowly trying to port over to, but sadly this might take a while, as it requires ripping out and putting back in a lot of the core parts of Auto-sklearn.

@aron-barm might have more to say on this.

eddiebergman avatar Nov 28 '22 08:11 eddiebergman

@eddiebergman thank you for the response! I see that RunValue has attributes such as cost and time, but accuracy is something different and there is no accuracy attribute. How can I access it from the RunValue parameter?

DarkEol avatar Nov 28 '22 17:11 DarkEol

SMAC (from which RunValue comes) is unaware of what accuracy is; as far as it's concerned, it is simply the "cost".

Please note that this is still version 1.2 of SMAC, but you can see its definition here:

https://github.com/automl/SMAC3/blob/863e4290054847ba2688521b8cc2e44c15a1493a/smac/runhistory/runhistory.py#L91-L93

TL;DR: If you used accuracy as the optimization metric (the metric argument when constructing the autosklearn estimator), then the "cost" will be 1 - acc. This is also the default for binary and multiclass classification problems.

This means accuracy = 1 - run_value.cost
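
Putting that together, a rough sketch of an accuracy-based stopping callback could look like the following. It assumes accuracy is the optimization metric (so cost = 1 - accuracy), that the callback receives the same four arguments as in the Early stopping and Callbacks example, and that returning False asks SMAC to stop. The names make_accuracy_stopper and FakeRunValue are made up for illustration and are not part of Auto-sklearn or SMAC.

```python
from collections import namedtuple


def make_accuracy_stopper(target_accuracy):
    """Build a callback that stops the search once a single model
    reaches at least `target_accuracy` (hypothetical helper)."""

    def callback(smbo, run_info, result, time_left):
        # Assumption: accuracy is the optimization metric, so cost = 1 - accuracy.
        accuracy = 1.0 - result.cost
        if accuracy >= target_accuracy:
            # Assumption: returning False tells SMAC to stop the optimization.
            return False
        # Returning None lets the search continue.
        return None

    return callback


# Stand-in for SMAC's RunValue, only so the sketch can be tried
# without installing Auto-sklearn.
FakeRunValue = namedtuple("FakeRunValue", ["cost", "time"])

stop_at_95 = make_accuracy_stopper(0.95)
good_run = stop_at_95(None, None, FakeRunValue(cost=0.02, time=1.0), 100.0)
weak_run = stop_at_95(None, None, FakeRunValue(cost=0.20, time=1.0), 100.0)
```

With Auto-sklearn installed, the callback would presumably be passed as get_trials_callback=make_accuracy_stopper(0.95) when constructing the estimator. As noted earlier in the thread, this only looks at single models, so the final ensemble may score slightly better or worse than the threshold.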

eddiebergman avatar Nov 29 '22 13:11 eddiebergman