DESC
DESC copied to clipboard
Rescale optimizer termination criteria
Most of our optimizers have 3 main exit criteria: $$||\Delta f || / ||f|| < ftol$$ $$||\Delta x || / ||x|| < xtol$$ $$||\nabla f ||_\infty < gtol$$
The normalizations in the objective functions generally ensure that the first one (ftol) is more or less independent of scaling/units etc, however xtol and gtol still have units of x or 1/x respectively, meaning that rescaling variables can affect the stopping criteria. This is especially noticeable when doing coil optimization or optimizing a current profile, since current is ~1e6x larger than the other degrees of freedom, it can mean exiting early once the current reaches a good value but the other variables can still be far from optimal.
This PR adds an option scaled_termination (defaults to True) to all of the desc optimizers to measure the norms for xtol and gtol in the scaled norm provided by x_scale (which defaults to using an adaptive scaling based on the Jacobian or Hessian). This should make things a bit better when optimizing parameters with widely different magnitudes.
Resolves #797
| benchmark_name | dt(%) | dt(s) | t_new(s) | t_old(s) |
| -------------------------------------- | ---------------------- | ---------------------- | ---------------------- | ---------------------- |
test_build_transform_fft_midres | -0.41 +/- 4.49 | -2.50e-03 +/- 2.72e-02 | 6.04e-01 +/- 1.8e-02 | 6.07e-01 +/- 2.0e-02 |
test_build_transform_fft_highres | +0.68 +/- 1.70 | +6.53e-03 +/- 1.63e-02 | 9.68e-01 +/- 1.3e-02 | 9.61e-01 +/- 1.0e-02 |
test_equilibrium_init_lowres | -0.42 +/- 2.05 | -1.66e-02 +/- 7.98e-02 | 3.88e+00 +/- 3.3e-02 | 3.90e+00 +/- 7.3e-02 |
test_objective_compile_atf | +0.39 +/- 4.85 | +3.07e-02 +/- 3.84e-01 | 7.94e+00 +/- 2.7e-01 | 7.91e+00 +/- 2.7e-01 |
test_objective_compute_atf | +0.01 +/- 4.75 | +8.98e-07 +/- 5.00e-04 | 1.05e-02 +/- 2.1e-04 | 1.05e-02 +/- 4.5e-04 |
test_objective_jac_atf | -0.37 +/- 3.62 | -7.21e-03 +/- 7.00e-02 | 1.93e+00 +/- 6.0e-02 | 1.93e+00 +/- 3.6e-02 |
test_perturb_1 | +1.02 +/- 2.25 | +1.45e-01 +/- 3.19e-01 | 1.43e+01 +/- 1.6e-01 | 1.42e+01 +/- 2.8e-01 |
test_proximal_jac_atf | -0.04 +/- 1.05 | -3.01e-03 +/- 8.68e-02 | 8.23e+00 +/- 5.6e-02 | 8.23e+00 +/- 6.7e-02 |
test_proximal_freeb_compute | -0.81 +/- 1.16 | -1.58e-03 +/- 2.29e-03 | 1.95e-01 +/- 1.1e-03 | 1.97e-01 +/- 2.0e-03 |
test_solve_fixed_iter_compiled | +0.15 +/- 1.67 | +2.57e-02 +/- 2.82e-01 | 1.69e+01 +/- 2.6e-01 | 1.69e+01 +/- 1.1e-01 |
test_build_transform_fft_lowres | -4.31 +/- 7.62 | -2.36e-02 +/- 4.18e-02 | 5.25e-01 +/- 2.4e-02 | 5.49e-01 +/- 3.4e-02 |
test_equilibrium_init_medres | -5.74 +/- 2.14 | -2.51e-01 +/- 9.39e-02 | 4.13e+00 +/- 5.9e-02 | 4.38e+00 +/- 7.3e-02 |
test_equilibrium_init_highres | -4.48 +/- 1.84 | -2.53e-01 +/- 1.04e-01 | 5.39e+00 +/- 4.2e-02 | 5.64e+00 +/- 9.5e-02 |
test_objective_compile_dshape_current | -0.01 +/- 5.50 | -3.88e-04 +/- 2.11e-01 | 3.84e+00 +/- 2.1e-01 | 3.84e+00 +/- 3.5e-02 |
test_objective_compute_dshape_current | -0.99 +/- 1.26 | -3.62e-05 +/- 4.59e-05 | 3.62e-03 +/- 3.3e-05 | 3.65e-03 +/- 3.2e-05 |
test_objective_jac_dshape_current | +2.34 +/- 8.02 | +9.43e-04 +/- 3.23e-03 | 4.12e-02 +/- 2.8e-03 | 4.03e-02 +/- 1.5e-03 |
test_perturb_2 | +1.20 +/- 2.91 | +2.29e-01 +/- 5.55e-01 | 1.93e+01 +/- 5.1e-01 | 1.91e+01 +/- 2.2e-01 |
test_proximal_freeb_jac | +0.98 +/- 1.25 | +7.31e-02 +/- 9.31e-02 | 7.55e+00 +/- 7.9e-02 | 7.48e+00 +/- 4.9e-02 |
test_solve_fixed_iter | +2.91 +/- 2.30 | +8.28e-01 +/- 6.55e-01 | 2.93e+01 +/- 6.2e-01 | 2.85e+01 +/- 2.2e-01 |
test_LinearConstraintProjection_build | +1.14 +/- 1.87 | +2.59e-01 +/- 4.26e-01 | 2.30e+01 +/- 3.5e-01 | 2.27e+01 +/- 2.4e-01 |
A suggestion for the intended functionality, can we just take the ratio step_ratio=step/x and check if any element of step_ratio is smaller then some new tolerance rtol?
Codecov Report
Attention: Patch coverage is 97.72727% with 1 line in your changes missing coverage. Please review.
Project coverage is 95.60%. Comparing base (
17d4939) to head (1d121c7). Report is 1598 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| desc/optimize/aug_lagrangian_ls.py | 93.75% | 1 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #1073 +/- ##
==========================================
+ Coverage 95.58% 95.60% +0.01%
==========================================
Files 96 96
Lines 24597 24601 +4
==========================================
+ Hits 23512 23519 +7
+ Misses 1085 1082 -3
| Files with missing lines | Coverage Δ | |
|---|---|---|
| desc/optimize/aug_lagrangian.py | 96.95% <100.00%> (+1.32%) |
:arrow_up: |
| desc/optimize/fmin_scalar.py | 98.08% <100.00%> (+0.01%) |
:arrow_up: |
| desc/optimize/least_squares.py | 99.33% <100.00%> (+<0.01%) |
:arrow_up: |
| desc/optimize/aug_lagrangian_ls.py | 95.69% <93.75%> (+0.02%) |
:arrow_up: |
The advanced optimization notebook has some differing results now (probably due to differing termination events of the eq subproblem with this branch vs master, at least for the proximal case), the auglag result is slightly worse and the QS results change slightly too. I think it would be good to run all the notebooks again (or at least some of them, or make an issue/other branches that runs them and updates them) so that what we have up on the docs is what actually a user would see when running the notebook.
Should we merge this before new release, or you want to wait until #1406 ?
Should we merge this before new release, or you want to wait until #1406 ?
I think we can wait till the next one.