AugmentedGaussianProcesses.jl
Bayesian Optimisation
(Apologies if it seems like I've been spamming too many issues!)
BayesianOptimisation.jl is a neat little package built on top of GaussianProcesses.jl that provides support for Bayesian Optimisation, a common application of Gaussian Processes. It would be quite useful if it could be made to work with AGP.jl in order to benefit from the added flexibility provided by KernelFunctions.jl.
Having a look at the code of BO.jl, it seems that adding support for AGP.jl should not be too difficult in principle. One crucial ingredient is the ability to update GPs on the fly, which seems less straightforward with AGP.jl compared to GP.jl's `fit!` function. The second is maximum a posteriori (MAP) estimation of kernel hyperparameters, which is part of BO.jl. For now one could simply replace this with maximum likelihood (ML) estimation, but adding support for priors over kernel parameters one way or another could be very useful. I think the `train!` function basically performs ML estimation, but having to specify the number of iterations is a bit less convenient than the automatic approach adopted in GP.jl.
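For reference, a rough sketch of what that looks like in AGP.jl; the exact constructor and keyword arguments are from memory and may differ between versions:

```julia
using AugmentedGaussianProcesses, KernelFunctions

X = rand(50, 2)                        # 50 two-dimensional inputs
y = sin.(X[:, 1]) .+ 0.1 .* randn(50)

m = GP(X, y, SqExponentialKernel())
train!(m, 100)   # iteration count must be given explicitly, unlike
                 # GP.jl's optimize!, which runs until convergence
```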
No problem with the issues :)
I think having BO integrated is definitely not the current objective of this package, especially since other packages do it so well already :) But I would be open if someone wants to try a PR!
I was more thinking of adding AGP support to BO.jl. The one thing I'm not sure how to implement is to update the GP at every step; it seems like AGP is more geared towards static GPs with fixed sample sizes...
Actually a great project would be to implement Bayesian Optimisation with AbstractGPs.jl; AGP will at some point (soon) be based directly on it
I will have a look at the documentation of AbstractGPs.jl - although as a newcomer the multiplicity of approaches can get confusing at times!
We are actually working on unifying everything. AbstractGPs is a project in this direction!
If I understand correctly, Stheno now relies on AbstractGPs instead of providing its own implementation, is that right? I would propose updating the GP comparison page in that case - having more up-to-date information would be quite useful!
In addition, there are multiple options for Bayesian Optimisation here. One would be to try and provide another backend for AugmentedGaussianProcesses.jl, the other would be to create an alternative along the lines of Stheno. I would lean towards the former, as I would like to prevent package fragmentation, unless there is an almost trivial way to do BO by composing other packages (I'm not too familiar with Zygote, Flux etc.). Any thoughts?
> I would propose updating the GP comparison page in that case - having more up-to-date information would be quite useful!
I will! It's just that things are moving a lot at the moment and I wanted to wait for a clearer view of what's there. In any case this page probably belongs somewhere in the JuliaGaussianProcesses organisation
> I would lean towards the former, as I would like to prevent package fragmentation
I personally think it would be much more efficient to build a separate package living in JuliaGaussianProcesses, based solely on AbstractGPs.jl, but I don't think anyone in the organisation currently has the time for that. Using Stheno is not even necessary, since mostly standard GPs are used for BO.
Another option would be to change the backend of BayesianOptimisation.jl to AbstractGPs but I don't know how much work this would require and if the package owners would be open to it.
I would be open to trying that actually, I have some experience with Bayesian Optimisation although I am (as mentioned) less familiar with the modern Julia way of doing things.
A simple initial idea would be to implement a `boptimize!` function, just as with BayesianOptimisation.jl.
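For concreteness, here is a minimal sketch of what such a loop could look like on top of AbstractGPs.jl; `optimize_acquisition` is a hypothetical placeholder (e.g. maximising expected improvement), not an existing API:

```julia
using AbstractGPs

# Hypothetical boptimize!-style loop: rebuild the posterior from scratch
# at every iteration and query the objective at the acquisition optimum.
function boptimize!(objective, kernel, x, y; iterations = 10, noise = 1e-6)
    for _ in 1:iterations
        fx = GP(kernel)(x, noise)           # prior at the current inputs
        p_f = posterior(fx, y)              # exact posterior (full rebuild)
        x_next = optimize_acquisition(p_f)  # placeholder: maximise the acquisition
        push!(x, x_next)
        push!(y, objective(x_next))
    end
    return x, y
end
```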
Questions to consider:
- how to deal with incremental GP building - as far as I recall AbstractGPs doesn't handle this well, so should one construct a new GP after adding training points?
- should there be a generic optimisation interface, or should it focus on one package like NLopt? How about MathOptInterface?
> how to deal with incremental GP building - as far as I recall AbstractGPs doesn't handle this well, so should one construct a new GP after adding training points?
I am not sure; you could update the Cholesky via https://github.com/JuliaGaussianProcesses/AbstractGPs.jl/blob/5bdd823df2e583e6783b9dbd5004eaccdb0ce27d/src/util/common_covmat_ops.jl#L35 but you're right that a lot of the functionality is missing.
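For reference, the block-Cholesky identity behind such an update, sketched with plain LinearAlgebra and illustrative names (the linked AbstractGPs helper may look different):

```julia
using LinearAlgebra

# Extend the lower Cholesky factor L11 of K11 to the factor of
# [K11 K12; K12' K22] without refactorising K11:
# O(n²m) for the triangular solve instead of O((n+m)³) for a full rebuild.
function grow_chol(L11::LowerTriangular, K12::AbstractMatrix, K22::AbstractMatrix)
    L21 = Matrix((L11 \ K12)')                     # from L11 * L21' = K12
    L22 = cholesky(Symmetric(K22 - L21 * L21')).L  # factor of the Schur complement
    LowerTriangular([Matrix(L11) zeros(size(K12)); L21 Matrix(L22)])
end
```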
> should there be a generic optimisation interface, or should it focus on one package like NLopt? How about MathOptInterface?
I think one could use https://github.com/SciML/GalacticOptim.jl, which is a wrapper around other optimisers. But any other interface would do!
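Adapting GalacticOptim's README example, usage could look roughly like this; the `nlml` objective here is just a stand-in for the negative log marginal likelihood:

```julia
using GalacticOptim, Optim

nlml(θ, p) = sum(abs2, θ)        # placeholder objective in the kernel parameters θ
θ0 = zeros(2)

prob = OptimizationProblem(nlml, θ0)
sol = solve(prob, NelderMead())  # swap in any solver that GalacticOptim wraps
```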
I think for now we might need to rebuild the GP after each training step. This is quite expensive, but one could provide more efficient implementations later on. In any case trying to do BO with AbstractGPs could be a nice way to spot missing features.
Ideally one could support multiple optimisation backends, but I will have a look at GalacticOptim.
Although GalacticOptim has Flux and Optim as two rather large dependencies...
Some initial thoughts about the current AbstractGP API after a bit of experimentation:
- There seems to be no simple way of accessing observations in a `PosteriorGP` except via `data.x` and `data.δ`; a public API could be useful
- `rand(gp, x)` as an alternative to `rand(gp(x))`, similar to `mean`, `cov` etc.
- `convert(Normal, gp, x)` or `convert(MvNormal, gp, X)` would be convenient
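To make the second suggestion concrete, such an overload could simply forward to the corresponding `FiniteGP`. This is a hypothetical sketch of the proposed method, not something AbstractGPs currently provides:

```julia
using AbstractGPs, Random

# Hypothetical convenience methods: sample a GP directly at locations x,
# forwarding to the FiniteGP gp(x).
Base.rand(rng::AbstractRNG, gp::AbstractGPs.AbstractGP, x::AbstractVector) =
    rand(rng, gp(x))
Base.rand(gp::AbstractGPs.AbstractGP, x::AbstractVector) =
    rand(Random.default_rng(), gp, x)
```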
These are good points! For the last point, `FiniteGP` already inherits from `AbstractMvNormal`, so the `convert` is probably not needed
Could you please open an issue on AbstractGPs?
> Although GalacticOptim has Flux and Optim as two rather large dependencies...
Depending on your exact needs, you could depend only on SciMLBase and not on GalacticOptim. In SciMLBase we define e.g. `OptimizationFunction` and `OptimizationProblem` (see https://github.com/SciML/SciMLBase.jl/blob/e44240ff63a0eae590ba25653ad1b164714585aa/src/problems/basic_problems.jl#L122-L142), so it is sufficient if you want to define an optimization problem for GalacticOptim and, e.g., let users provide the optimizer separately. Then users would have to load GalacticOptim and whatever optimization packages they want to use, but it would not have to be a dependency of your package.
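A rough sketch of that split, with illustrative names: the package side depends only on SciMLBase, while the solver choice stays on the user side:

```julia
# Package side: only SciMLBase is a dependency.
using SciMLBase

function hyperparam_problem(nlml, θ0)
    f = SciMLBase.OptimizationFunction(nlml, SciMLBase.NoAD())
    return SciMLBase.OptimizationProblem(f, θ0)
end

# User side (their own environment):
#   using GalacticOptim, Optim
#   prob = hyperparam_problem((θ, p) -> ..., zeros(2))
#   sol = solve(prob, NelderMead())
```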