Unexpected stop because of check_convergence in stopping_criterion
When running benchmarks with multiple solvers and `max_runs` set, some solvers end up producing far fewer iterates than `max_runs` because `check_convergence` makes them exit early. This may not be the intended behavior for some users, especially for DL people using mini-batch SGD-like solvers, where the objective (test error) often starts rising again before going back down, because of non-convexity (finding a local minimum, then "escaping" it so the objective rises before a better one is found), variance, or the double-descent phenomenon.
A couple of things that would help:
- make it clearer in the logs that the run stopped because of `check_convergence`
- allow users to disable this check
- implement more robust logic
- explain in the docs what this check is doing (notably the role of patience and tolerance (EPS) and their default values)
Yes, I agree that this is a confusing part of benchopt, and I would be happy to find a way to make it clearer/simpler.
Users can already change the way the checks are done by setting the class attribute `stopping_criterion` on their solver, as described in this part of the doc (see the sketch at the end of this message). We reworked this not long ago, but I think it is still confusing because:
- It is not clear what `Solver.stopping_criterion` is, nor how to override it.
- The default values are not described.
- The default value, `SufficientProgressCriterion(patience=3, eps=1e-10)`, is maybe not well suited.
This choice also needs to take into account the selection of a proper sampling strategy, which is on a log scale for now but could be changed to linear by default. I would be happy to have your thoughts on this one.
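To make this concrete, here is a minimal sketch of what overriding the criterion in a solver could look like. The solver name, the `set_objective` arguments, and the hook bodies are placeholders, and the exact hooks and expected return format may vary with the benchopt version:

```python
import numpy as np

from benchopt import BaseSolver
from benchopt.stopping_criterion import SufficientProgressCriterion


class Solver(BaseSolver):
    # hypothetical solver name, for illustration only
    name = 'minibatch-sgd'

    # Override the default SufficientProgressCriterion(patience=3, eps=1e-10).
    # A larger patience tolerates more consecutive evaluations without
    # sufficient objective progress before check_convergence stops the run,
    # which helps with non-monotone curves such as mini-batch SGD test error.
    stopping_criterion = SufficientProgressCriterion(patience=20, eps=1e-10)

    def set_objective(self, X, y):
        # store whatever the objective passes to the solver
        self.X, self.y = X, y

    def run(self, n_iter):
        # placeholder for the actual solver loop over n_iter iterations
        self.beta = np.zeros(self.X.shape[1])

    def get_result(self):
        # return the iterate to be scored by the objective
        # (the expected return format depends on the benchopt version)
        return self.beta
```

With something like this, the benchmark is run as usual and only the convergence check changes; nothing else in the solver needs to be modified.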
I only found out about this part of the doc from your comment, and I actually had to "debug" this behavior by diving deep into the code. Adding a simple example to the docs showing how to override this stopping criterion with the least amount of code (something like the sketch above) would help new users a lot. The interaction between this stopping criterion and the sampling strategy should indeed be emphasized. Regarding the defaults, I frankly have opinions, but again they would suit DL people and maybe not the entire optimization community.