emukit icon indicating copy to clipboard operation
emukit copied to clipboard

Implementation of preferential Bayesian optimization.

Open javiergonzalezh opened this issue 6 years ago • 14 comments

As detailed in this paper: https://arxiv.org/abs/1704.03651

javiergonzalezh avatar Dec 19 '18 11:12 javiergonzalezh

Any updates on this feature?

danielwinkler avatar Jul 03 '20 07:07 danielwinkler

@danielwinkler , thanks to @esiivola we have an example of preferential bayes opt implemented here: https://github.com/amzn/emukit/tree/master/emukit/examples/preferential_batch_bayesian_optimization . It's not an original that Javier linked to above, but can serve as, well, example of how it can be implemented with Emukit

apaleyes avatar Jul 03 '20 07:07 apaleyes

@apaleyes wow that is amazing, thanks a lot for your fast response!

danielwinkler avatar Jul 03 '20 07:07 danielwinkler

To add, the acquisition functions used in the example are not exactly as in https://arxiv.org/abs/1704.03651 However, the original paper has limitations in the scalability w.r.t. the number of attributes in x. The implemented approach solves that issue and also makes it possible to compare more than two items.

esiivola avatar Jul 03 '20 07:07 esiivola

@esiivola thanks a lot for the implementation and the additional information

A question that is also related to this topic: All papers i seem to read address the limitation of PBO to high dimensional spaces On the other hand, the IBO of Brochu already did preferential BO for high dimensional parameter settings with very nice results

I wonder if this is due to the possible unimodal nature of the parameter space of his example applications?

Do you have any explanation / intuition in this respect?

danielwinkler avatar Jul 03 '20 08:07 danielwinkler

The IBO approach of Brochu only presents one acquisition strategy which is very exploitative. Some consider this to be a limitation as well (Which Gonzalez et al. solve in their paper).

The discussion in the papers you have read might be related to the unimodal nature of their application domain. I have to admit, I'm not an expert on animation design so I cannot really give a good answer on this. Gonzalez et al. however compare their approach to IBO and clearly perform better. This might be so because they might be able to capture the uncertainty of the latent space better (by extending it). You can see this by comparing the results of IBO and CEI, which both are exploitative acquisition strategies. However, this comes with the cost of not being very scalable.

esiivola avatar Jul 03 '20 08:07 esiivola

So if you only need to compare two items (and not a batch of items) AND your input is not very high dimensional, the approach of Gonzalez et al. might be the best option for you (but the source code is not openly available). If not, then it might be a good idea to try our approach! If you are only comparing two items and are using q-EI, then the implemented example becomes IBO.

esiivola avatar Jul 03 '20 08:07 esiivola

This is the only open source implementation of the paper by Gonzalez. et al. that I know of: https://github.com/fcole90/minimal_pbo

However, they do not implement any acquisition function, but it might be easy to extend their code.

esiivola avatar Jul 03 '20 08:07 esiivola

@esiivola Thanks a lot for your support

I only need to compare two items BUT it is high dimensional

From reading papers, this should not work; however, my hope is that the problem nature is similar to the application domain of Brochu. In a way you can think of it of tuning a single or two channels of a sound mixer (volume, pan, low freq, mid freq, high freq, ...) for a given sound and environment to the preference of a user / expert

  • each of the knobs (parameters) should be unimodal
  • there may also be correlations between the parameters (e.g. the freqs) (that should be applicable to the covariance of the GP)

I will have a look at minimal pbo that you linked, and definitely also at your implementation

I really appreciate all the effort you put into helping me!

danielwinkler avatar Jul 03 '20 08:07 danielwinkler

Knowing your application, if you can also ask the user for optimal knob position given that other parameters are fixed, you might also be interested in this upcoming ICML 2020 paper: https://arxiv.org/pdf/2002.03113.pdf

esiivola avatar Jul 03 '20 08:07 esiivola

Thanks,

I skimmed through the paper already this morning, I have to read it in detail but so far it is not clear to me how to express the problem as projective preferential query.

What you are suggesting is that the expert user would be able to the tell the optimal value for a knob, given the the acquisition strategy prescribes the values of the remaining dimensions?

danielwinkler avatar Jul 03 '20 09:07 danielwinkler

Hi!

Yes, exactly! If so, the whole optimization process would be much faster than with a single preferential query. They actually are inputting the "optimal knob position" as multiple preferential feedbacks, allowing the model to learn much faster.

esiivola avatar Jul 03 '20 09:07 esiivola

great, I will look deeper into that direction as the results of PPBO look really impressive!

All the best, Daniel

danielwinkler avatar Jul 03 '20 09:07 danielwinkler

Hi,

I am also struggling on the implementation of the acquisition function. Following the work of Hernandez2014, the Gaussian process prior for f can be approximated by the inner product of a vector of random features and \theta, where the prior of \theta is a standard multivariate normal distribution. In the experiments of PBO, the length of the vector seems to be unknown. More importantly, \theta is randomly drawn in a posterior, where based on my derivations, P(\theta| Dt) is proportional to P(\theta)*P(Dt|\theta), which is also proportional to P(\theta)P(y|X,\theta). The final form of the posterior is C*P(\theta)*sigmoid(y1*\phi(x1)^T*\theta)*...*sigmoid(yn*\phi(xn)^T*\theta), where C is a normalization factor. Sampling from such a posterior of \theta seems to be computationally hard especially when \theta is high-dimensional (e.g. 1000 in Hernandez2014). I am not sure where my thoughts went wrong. I would really appreciate if there is any help.

Sorry for the bad editing. I really hope Github could add the feature of editing mathematical formulas for efficient discussions. :)

Best, Jinnian

DominickZhang avatar Jul 09 '20 16:07 DominickZhang