
Incremental Bayesian optimisation?

brurucy opened this issue 11 months ago • 10 comments

Hi,

I wonder: is it possible for the service struct to incorporate new samples incrementally, instead of recomputing everything from scratch in every iteration?

brurucy · Mar 24 '24 12:03

Hi. Could you elaborate? Do you have some suggestions or paper references in mind? At the moment I have just made a straightforward implementation without retaining any state.

relf · Mar 24 '24 18:03

Here is the test for EgorService:

```rust
// Imports as used in the egobox test suite; `xsinx` is the 1-D test
// objective defined alongside this test.
use approx::assert_abs_diff_eq;
use egobox_ego::{EgorServiceBuilder, InfillStrategy};
use egobox_moe::{CorrelationSpec, RegressionSpec};
use ndarray::{array, concatenate, Axis};
use ndarray_stats::QuantileExt;

#[test]
fn test_xsinx_egor_builder() {
    let ego = EgorServiceBuilder::optimize()
        .configure(|conf| {
            conf.regression_spec(RegressionSpec::ALL)
                .correlation_spec(CorrelationSpec::ALL)
                .infill_strategy(InfillStrategy::EI)
                .random_seed(42)
        })
        .min_within(&array![[0., 25.]]);

    // Initial design of experiments: 4 points in [0, 25].
    let mut doe = array![[0.], [7.], [20.], [25.]];
    let mut y_doe = xsinx(&doe.view());
    for _i in 0..10 {
        // Each call retrains the GP on the full (doe, y_doe) history.
        let x_suggested = ego.suggest(&doe, &y_doe);

        doe = concatenate![Axis(0), doe, x_suggested];
        // Note: this re-evaluates the whole DoE, not just the new point.
        y_doe = xsinx(&doe.view());
    }

    let expected = -15.1;
    let y_opt = y_doe.min().unwrap();
    assert_abs_diff_eq!(expected, *y_opt, epsilon = 1e-1);
}
```

You have doe and y_doe. In each iteration you consume all of doe and y_doe to suggest one new x, which is then evaluated and appended to them.

Does this imply that each iteration incurs O(n^3) (GP training) complexity?

This crate, https://crates.io/crates/friedrich, claims to be able to process additional samples in O(n^2) time. Would it be possible to leverage this to ensure that suggest is not cubic?
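
For reference, the O(n^2) claim presumably relies on the standard trick of extending the Cholesky factor of the covariance matrix when one sample is added, instead of refactorizing from scratch. A minimal sketch of that update (my own illustration, not friedrich's actual code):

```rust
use ndarray::{s, Array1, Array2};

/// Given L with K = L Lᵀ for n samples, the covariance vector k between the
/// new sample and the existing ones, and the new sample's self-covariance
/// k_ss, extend L to the (n+1) x (n+1) factor in O(n^2).
fn cholesky_append(l: &Array2<f64>, k: &Array1<f64>, k_ss: f64) -> Array2<f64> {
    let n = l.nrows();
    // Forward substitution: solve L v = k, O(n^2).
    let mut v = Array1::<f64>::zeros(n);
    for i in 0..n {
        let partial: f64 = (0..i).map(|j| l[[i, j]] * v[j]).sum();
        v[i] = (k[i] - partial) / l[[i, i]];
    }
    // New diagonal term; assumes the extended K stays positive definite.
    let d = (k_ss - v.dot(&v)).sqrt();
    let mut l_new = Array2::<f64>::zeros((n + 1, n + 1));
    l_new.slice_mut(s![..n, ..n]).assign(l);
    l_new.slice_mut(s![n, ..n]).assign(&v);
    l_new[[n, n]] = d;
    l_new
}
```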

I want to integrate egobox with a simulated annealing method in a library of mine, but for that it cannot re-train at each step; otherwise it will be far too slow.

brurucy · Mar 24 '24 19:03

Yes, each iteration implies an O(n^3) cost. So far, in the context of my use cases (n ~ 100), it has not been a problem. What is n in your case?

Otherwise, I had forgotten about this feature in friedrich; I had already thought about leveraging it but did not take the time to do so. I suppose it should be feasible to adapt EgorService to work with friedrich. Actually, working with other GP implementations was the point of the SurrogateBuilder trait, but it will require some work.

relf · Mar 25 '24 09:03

In any MCMC-based heuristic, such as simulated annealing, parallel tempering or anything else, the number of steps is very large: we are talking about perhaps hundreds of thousands, or even millions.

For this, it would be very helpful if egobox could make suggestions incrementally. I wonder, is there an even smarter way to go about this? Perhaps handling updates in less than O(n^2) time?
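
Back-of-the-envelope, to make the stakes concrete: retraining from scratch at every step accumulates

$$\sum_{n=1}^{N} O(n^3) = O(N^4) \qquad \text{vs.} \qquad \sum_{n=1}^{N} O(n^2) = O(N^3)$$

for incremental updates. At N ~ 1e5 that is roughly 2.5e19 versus 3e14 operations.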

brurucy · Mar 25 '24 10:03

I've recently implemented sparse GP methods in the gp module, which might be suitable for such a challenge, but I still have to refactor to make them available within Egor. Though, to my understanding, the primary objective of EGO is to optimize expensive functions (hence the small n) using an interpolating GP; I am not sure how it would perform with a regressive/noisy GP.
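
For context, these sparse methods rest on the usual inducing-point idea: pick m << n inducing points Z and approximate the full covariance as K ≈ K_nm K_mm^{-1} K_mn, so the heavy algebra happens on m x m matrices and training drops to O(n m^2). A rough sketch of the two covariance blocks (illustrative only, with an assumed RBF kernel and theta hyperparameter; not the egobox implementation):

```rust
use ndarray::{Array2, ArrayView1};

// Squared-exponential kernel, the usual default (theta assumed given).
fn rbf(a: ArrayView1<f64>, b: ArrayView1<f64>, theta: f64) -> f64 {
    let d2: f64 = a.iter().zip(b.iter()).map(|(x, y)| (x - y).powi(2)).sum();
    (-theta * d2).exp()
}

// Cross- and inducing-covariance blocks of the Nyström approximation
// K ≈ K_nm K_mm^{-1} K_mn: all downstream algebra then works on m x m
// matrices, so training drops from O(n^3) to O(n m^2).
fn nystrom_blocks(x: &Array2<f64>, z: &Array2<f64>, theta: f64)
    -> (Array2<f64>, Array2<f64>) {
    let (n, m) = (x.nrows(), z.nrows());
    let mut k_nm = Array2::<f64>::zeros((n, m));
    for i in 0..n {
        for j in 0..m {
            k_nm[[i, j]] = rbf(x.row(i), z.row(j), theta);
        }
    }
    let mut k_mm = Array2::<f64>::zeros((m, m));
    for i in 0..m {
        for j in 0..m {
            k_mm[[i, j]] = rbf(z.row(i), z.row(j), theta);
        }
    }
    (k_nm, k_mm)
}
```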

relf · Mar 25 '24 11:03

I want to use Egor's potential incremental suggestions to power Bayesian optimisation over up to ~5700 input variables (only one output, however).

5700 comes from the number of qubits that the biggest quantum annealers currently have. I wrote https://github.com/brurucy/ernst to semi-simulate that.

brurucy · Mar 25 '24 11:03

Do you mean that, besides having n ~ 1e5 samples, your input dimension is 5700?! 😮

relf · Mar 25 '24 13:03

Up to that, yes.

brurucy · Mar 25 '24 16:03

In egobox, high dimension is handled with the PLS dimension-reduction technique, and I've more or less successfully used it for something like 120-dimensional inputs, but 5700 is definitely another level and I am not at all sure it can produce meaningful results 😞 I think you have to find a way to reduce the dimension of your problem, for, to my knowledge, no GP can handle such a high dimension.
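
For reference, this reduction is driven by the kpls_dim configuration option (assuming the current EgorConfig API); the kernel hyperparameters are then estimated in the reduced latent space, e.g.:

```rust
// Hedged sketch: KPLS reduction to 3 latent dimensions via the (assumed)
// EgorConfig::kpls_dim option; `xlimits` is a hypothetical bounds array.
let ego = EgorServiceBuilder::optimize()
    .configure(|conf| conf.kpls_dim(3).random_seed(42))
    .min_within(&xlimits);
```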

relf · Mar 25 '24 17:03

There's a caveat. All of these variables are binary: +1 or -1.

brurucy · Mar 25 '24 18:03

Closing stale issue; feel free to open a fresh one if needed.

relf · May 28 '24 13:05