egobox
Incremental Bayesian optimisation?
Hi,
I wonder, is it possible for the service struct to incrementally adjust to new suggestions, instead of recomputing everything from scratch in every iteration?
Hi. Could you elaborate? Do you have some suggestions or paper references in mind? At the moment I have just made a straightforward implementation without retaining any state.
Here is the test for EgorService:
#[test]
fn test_xsinx_egor_builder() {
    let ego = EgorServiceBuilder::optimize()
        .configure(|conf| {
            conf.regression_spec(RegressionSpec::ALL)
                .correlation_spec(CorrelationSpec::ALL)
                .infill_strategy(InfillStrategy::EI)
                .random_seed(42)
        })
        .min_within(&array![[0., 25.]]);
    let mut doe = array![[0.], [7.], [20.], [25.]];
    let mut y_doe = xsinx(&doe.view());
    for _i in 0..10 {
        let x_suggested = ego.suggest(&doe, &y_doe);
        doe = concatenate![Axis(0), doe, x_suggested];
        y_doe = xsinx(&doe.view());
    }
    let expected = -15.1;
    let y_opt = y_doe.min().unwrap();
    assert_abs_diff_eq!(expected, *y_opt, epsilon = 1e-1);
}
You have doe and y_doe. In each iteration you consume all of doe and y_doe, to then suggest one new doe point and compute a y_doe for it.
Does this imply that each iteration has O(n^3) (GP training) complexity? This crate: https://crates.io/crates/friedrich claims to be able to process additional samples in O(n^2) time. Would it be possible to leverage this so that suggest is not cubic?
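For reference on where that O(n^2) figure comes from: when one sample is added, the Cholesky factor of the kernel matrix can be extended with a single triangular solve instead of being recomputed from scratch. Below is a minimal sketch of that update, assuming ndarray; it is not friedrich or egobox code, and the function name and nugget guard are my own choices.

use ndarray::{s, Array1, Array2};

/// Extend the lower-triangular Cholesky factor `l` of the n x n kernel matrix
/// K_n to the factor of K_{n+1} = [[K_n, k], [k^T, k_new]], where `k` is the
/// kernel column of the new sample and `k_new` its self-covariance.
/// Cost: one forward substitution, i.e. O(n^2), instead of an O(n^3) refactorization.
fn cholesky_append(l: &Array2<f64>, k: &Array1<f64>, k_new: f64) -> Array2<f64> {
    let n = l.nrows();
    // Forward substitution: solve L v = k in O(n^2).
    let mut v = Array1::<f64>::zeros(n);
    for i in 0..n {
        let mut acc = k[i];
        for j in 0..i {
            acc -= l[[i, j]] * v[j];
        }
        v[i] = acc / l[[i, i]];
    }
    // New diagonal entry, guarded with a tiny nugget for numerical safety.
    let d = (k_new - v.dot(&v)).max(1e-12).sqrt();
    // Assemble the (n + 1) x (n + 1) factor [[L, 0], [v^T, d]].
    let mut l_new = Array2::<f64>::zeros((n + 1, n + 1));
    l_new.slice_mut(s![..n, ..n]).assign(l);
    l_new.slice_mut(s![n, ..n]).assign(&v);
    l_new[[n, n]] = d;
    l_new
}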
I want to integrate egobox with a simulated annealing method in a library of mine, but for that it cannot re-train at each step; otherwise it will be far too slow.
Yes, each iteration implies an O(n^3) cost. So far, in the context of my use cases (n ~ 100), it has not been a problem. What is n in your case?
As for friedrich, I had forgotten about that feature; I had already thought about leveraging it but did not take the time to do so. I suppose it should be feasible to adapt EgorService to work with friedrich. Actually, that was the point of the SurrogateBuilder trait: to allow other GP implementations to be plugged in, but it will require some work.
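To make the plug-in idea concrete, here is a hypothetical, deliberately simplified interface; it is not the actual egobox SurrogateBuilder trait, just a sketch of the shape such an abstraction could take. Egor essentially needs a backend that can be fitted on (x, y) and can predict a mean and a variance, so an incrementally updated GP such as friedrich's could in principle sit behind it.

use ndarray::{Array1, Array2};

/// Hypothetical trained-surrogate interface (names are illustrative only).
trait Surrogate {
    /// Predicted mean at each row of `x`.
    fn predict(&self, x: &Array2<f64>) -> Array1<f64>;
    /// Predicted variance at each row of `x`.
    fn predict_var(&self, x: &Array2<f64>) -> Array1<f64>;
}

/// Hypothetical builder trait: any GP backend implementing this could be used,
/// including one that updates its factorization incrementally between fits.
trait SurrogateFit {
    type Model: Surrogate;
    fn fit(&self, x: &Array2<f64>, y: &Array1<f64>) -> Self::Model;
}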
In any MCMC-based heuristic, such as simulated annealing, parallel tempering or anything else, the number of steps is very large. We are talking about perhaps up to hundreds of thousands or millions.
For this, it would be very helpful if egobox could implement incremental suggestions. I wonder, is there an even smarter way to go about this, perhaps handling updates in less than O(n^2) complexity?
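In the meantime, one stopgap I can think of with the current suggest API is to cap the number of points it sees, so each call has a bounded cost at the price of forgetting older samples. A minimal sketch, assuming the doe/y_doe arrays from the test above; the helper name and the "keep the most recent points" rule are arbitrary choices of mine.

use ndarray::{s, Array2};

/// Keep only the most recent `window` points of the DOE so that each
/// suggest(&doe, &y_doe) call works on a bounded training set.
/// This is a user-side workaround, not an egobox feature: the model is still
/// refitted, but on at most `window` points instead of the full history.
fn windowed(doe: &Array2<f64>, y_doe: &Array2<f64>, window: usize) -> (Array2<f64>, Array2<f64>) {
    let n = doe.nrows();
    if n <= window {
        return (doe.to_owned(), y_doe.to_owned());
    }
    let start = n - window;
    (
        doe.slice(s![start.., ..]).to_owned(),
        y_doe.slice(s![start.., ..]).to_owned(),
    )
}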
I've recently implemented sparse GP methods in the gp module which might be suitable for such a challenge, but I still have to refactor to possibly make them available with Egor. Though, to my understanding, the primary objective of EGO was to optimize expensive functions (hence the small n) using an interpolating GP, so I am not sure how it would perform with a regressive/noisy GP.
I want to use Egor's potential incremental suggestions to power Bayesian optimisation of up to ~5700 input variables (only one output, however). 5700 comes from the number of qubits that the biggest quantum annealers currently have; I wrote https://github.com/brurucy/ernst to semi-simulate that.
Do you mean that, besides having n ~ 1e5 samples, your input dimension is 5700?! 😮
Up to that, yes.
In egobox, high dimension is handled with the PLS reduction technique, and I've more or less successfully used it for something like 120-dimensional inputs, but 5700 is definitely another level and I am not at all sure it can produce meaningful results 😞
I think you have to find a way to reduce the dimension of your problem because, to my knowledge, no GP can handle such a high dimension.
There's a caveat. All of these variables are binary: +1 or -1.
Closing stale issue, feel free to open a fresh one if needed.