
Update tutorials to use "sample_around_best": True

Open saitcakmak opened this issue 3 years ago • 8 comments

Motivation

Using "sample_around_best": True should lead to improved optimization performance.

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

TODO: Re-run the notebooks, so the outputs reflect the changes.

Related PRs

(If this PR adds or changes functionality, please take some time to update the docs at https://github.com/pytorch/botorch, and link to your PR here.)

cc @sdaulton

saitcakmak avatar Feb 09 '22 17:02 saitcakmak

Codecov Report

Merging #1075 (e7b6f4b) into main (40d89e1) will not change coverage. The diff coverage is n/a.


@@            Coverage Diff            @@
##              main     #1075   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          112       112           
  Lines         9330      9330           
=========================================
  Hits          9330      9330           


codecov[bot] avatar Feb 09 '22 17:02 codecov[bot]

Are there any cases where we wouldn't want to sample_around_best? If it's so commonly useful for AF optimization, why hide it in an options blob?


eytan avatar Feb 10 '22 01:02 eytan

Are there any cases where we wouldn't want to sample_around_best?

Probably not? I guess that requires some additional benchmarking.

If it's so commonly useful for AF optimization, why hide it in an options blob?

I would like to avoid adding a long rat's tail of boolean options to our interfaces, which will soon become unwieldy. If we want to enable this by default, we can do so without exposing it as a separate option.

Balandat avatar Feb 10 '22 03:02 Balandat

The main issue I can imagine with this is that the initialization becomes too greedy and we fail to uncover a global optimum of the acquisition function that is far from the best point(s). On its own, this may not actually be a bad thing, since it is similar to what TuRBO/MORBO do, and we may just end up avoiding points that we probably didn't want to pick in the first place. I think there will be a heavy dependence on the dimensionality here: this will probably always help in the high-dimensional setting, but it may make things worse for simpler low-dimensional problems. I echo what @Balandat said that some benchmarking would be interesting.

dme65 avatar Feb 10 '22 05:02 dme65

Thanks for putting this up @saitcakmak!

I think there are very few situations where using sample_around_best would perform worse. It is worth noting that sample_around_best does not replace the space-filling (Sobol) initialization heuristic. Rather, the initial conditions consist of raw_samples points from the global Sobol heuristic and raw_samples points from the sample_around_best heuristic. Hence, if the Sobol points identify a design far from the current best with a higher acquisition value than the points around the current best, then that design will be preferred in the initialization heuristic.

Using sample_around_best makes the heuristic slightly more greedy, so the one case where this could be suboptimal is if one of the Sobol points with lower acquisition value is not selected as a starting point for gradient optimization, but is proximal to a better optimum of the acquisition surface. However, there is still additional tempering since we do Boltzmann sampling on the acquisition values of the raw samples, which should help mitigate this. Using more random restarts would also help.
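
To make the mechanics above concrete, here is a simplified, self-contained sketch of the heuristic rather than the actual BoTorch implementation: raw candidates come from both a global Sobol set and perturbations of the incumbent, and restart points are then drawn by Boltzmann sampling on the acquisition values. `acq_func`, `best_x`, and `bounds` are placeholders, the perturbation scale and temperature are invented for the example, and the real heuristic perturbs only a subset of dimensions in high-dimensional problems (see below).

```python
import torch
from torch.quasirandom import SobolEngine

def pick_initial_conditions(acq_func, best_x, bounds, raw_samples=256,
                            num_restarts=10, sigma=0.05, eta=2.0):
    # bounds: 2 x d tensor; best_x: 1 x d tensor with the incumbent design.
    dim = bounds.shape[-1]
    # Global space-filling samples (kept even when sampling around the best).
    sobol = SobolEngine(dimension=dim, scramble=True).draw(raw_samples)
    sobol = bounds[0] + (bounds[1] - bounds[0]) * sobol
    # Additional samples: perturbations of the best point, clamped to the box.
    noise = sigma * (bounds[1] - bounds[0]) * torch.randn(raw_samples, dim)
    around_best = (best_x + noise).clamp(bounds[0], bounds[1])
    X = torch.cat([sobol, around_best], dim=0)
    # Boltzmann sampling on standardized acquisition values: higher values are
    # more likely to be selected, but distant Sobol points can still win restarts.
    with torch.no_grad():
        acq_vals = acq_func(X.unsqueeze(1))  # assumes a q=1, t-batch evaluation
    weights = torch.softmax(eta * (acq_vals - acq_vals.mean()) / acq_vals.std(), dim=0)
    idx = torch.multinomial(weights, num_restarts, replacement=False)
    return X[idx].unsqueeze(1)  # num_restarts x 1 x d initial conditions
```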

The main benefit of sample_around_best is to mitigate the issue of all raw samples having zero acquisition value. This can happen for improvement-based acquisition functions when the optima lie in tiny regions of the design space. For example, in ZDT1 the Pareto frontier is in a small sliver of the design space, and the acquisition surface quickly becomes zero basically everywhere, except right around the previously evaluated Pareto-optimal designs. Using sample_around_best makes acquisition optimization significantly more robust to these scenarios (see the attached plot).

[Figure: zdt1_nehvi_sample_around_best]

In the high-dimensional case, the sample_around_best heuristic only perturbs a subset of the dimensions of the best point(s) (and again, this complements rather than replaces the global Sobol heuristic). So I would expect there to be a significant performance improvement.
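
A hedged illustration of that subset-of-dimensions idea (again illustrative only, not the actual implementation; the perturbation probability and scale are invented for the example):

```python
import torch

def perturb_dim_subset(best_x, bounds, n_samples=128, p_perturb=0.2, sigma=0.05):
    # best_x: 1 x d incumbent; only ~p_perturb of its coordinates are moved per sample.
    dim = bounds.shape[-1]
    X = best_x.expand(n_samples, dim).clone()
    mask = torch.rand(n_samples, dim) < p_perturb  # which dimensions to perturb
    noise = sigma * (bounds[1] - bounds[0]) * torch.randn(n_samples, dim)
    return (X + mask * noise).clamp(bounds[0], bounds[1])
```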

The main challenge to turning this on by default is that it would require a unified interface for extracting previously evaluated designs (or the best designs) from BoTorch models and acquisition functions. The current error handling is fairly robust, but may not cover every single edge case for any conceivable (e.g., non-GPyTorch) model. @Balandat and I have chatted about adding an X_baseline/train_inputs property to all BoTorch models and acquisition functions, but that change would touch a lot of code. If we want to turn this on by default, we should explore that refactor.

sdaulton avatar Feb 10 '22 10:02 sdaulton

@saitcakmak has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Feb 28 '22 15:02 facebook-github-bot

@saitcakmak What is the status of this PR? It would be great to get this merged in.

sdaulton avatar Apr 04 '22 07:04 sdaulton

@sdaulton We're waiting on the benchmarking suite to decide whether to set sample_around_best True by default.

saitcakmak avatar Apr 04 '22 17:04 saitcakmak

We're waiting on the benchmarking suite to decide whether to set sample_around_best True by default.

@saitcakmak is this PR still relevant?

esantorella avatar Jun 05 '23 20:06 esantorella

The PR itself is not that relevant anymore (it's quite outdated). Making these changes to the default configs is still worth investigating, though. I never got around to properly benchmarking this, since we had a bunch of reproducibility bugs in the benchmarking suite back then.

One thing to keep in mind: the current implementation of sample_around_best doubles the number of raw_samples, so turning it on and off doesn't really give us an apples-to-apples comparison. If we're going to benchmark this, we should make sure both configurations use the same number of raw_samples.
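
If someone picks this up, a hedged sketch of what an apples-to-apples setup could look like, assuming the doubling behavior described above (`acqf` and `bounds` are placeholders, and the sample counts are illustrative):

```python
from botorch.optim import optimize_acqf

# Both arms evaluate roughly the same total number of raw candidates:
# the sample_around_best arm uses half the raw_samples, since the heuristic doubles them.
configs = {
    "sobol_only": dict(raw_samples=1024, options={}),
    "sample_around_best": dict(raw_samples=512, options={"sample_around_best": True}),
}
for name, cfg in configs.items():
    candidates, acq_value = optimize_acqf(
        acq_function=acqf,
        bounds=bounds,
        q=1,
        num_restarts=20,
        **cfg,
    )
```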

saitcakmak avatar Jun 05 '23 20:06 saitcakmak