
Parameter Types: Handling and Implementation

Open till-m opened this issue 3 years ago • 7 comments

Why? Support for non-float parameter types is by far the most-requested feature that this package currently lacks.

Why this issue? I would like to explicitly discuss whether and how to implement parameter types. In that sense, this issue isn't a feature request, but is intended to serve as a space to collect discussions about this topic.

How? From my perspective, the approach of Garrido-Merchán and Hernández-Lobato seems to make the most sense. This means converting the parameters within the kernel: $$\tilde{k}(x_i, x_j)=k(T(x_i), T(x_j))$$ where $T$ acts on elements of $x_i$ according to their type.
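To make the idea concrete, here is a minimal numpy sketch (not the package's actual code) of the wrapped kernel: the base kernel never sees the raw continuous point, because $T$ maps each coordinate to its typed representation first. The parameter layout is invented for illustration: `x[0]` is a float, `x[1]` an int, and `x[2:5]` a 3-level categorical in continuous-relaxation (one-hot) form.

```python
import numpy as np

def T(x):
    """Map a continuous point to its typed representation (layout assumed above)."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    out[1] = np.round(x[1])               # int: round to nearest integer
    onehot = np.zeros(3)
    onehot[int(np.argmax(x[2:5]))] = 1.0  # categorical: snap to a one-hot corner
    out[2:5] = onehot
    return out

def rbf(a, b, length_scale=1.0):
    """Plain RBF base kernel k."""
    d = np.asarray(a) - np.asarray(b)
    return float(np.exp(-0.5 * np.dot(d, d) / length_scale**2))

def k_tilde(xi, xj):
    """Wrapped kernel: k~(xi, xj) = k(T(xi), T(xj))."""
    return rbf(T(xi), T(xj))

# Two points in the same int/categorical "cell" become identical under T,
# so the wrapped kernel treats them as the same point:
print(k_tilde([0.3, 1.2, 0.9, 0.1, 0.2],
              [0.3, 0.8, 0.7, 0.3, 0.1]))  # 1.0
```

The effect is that the acquisition optimizer can keep working in a purely continuous space, while the GP only ever distinguishes points that differ after rounding/snapping.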

What is necessary? Essentially all of this "only" requires three functions to transform the parameters:

  • A function that converts the "canonical" representation of the parameters to the representation used by the kernel.
    • float and int parameters remain unchanged; one-hot encoding is applied to categorical variables.
  • A function that converts the kernel representation back to the canonical representation, used whenever the user interacts with parameters (logs, optimizer.max(), etc.).
    • float and int parameters remain unchanged; the one-hot encoding is reversed.
  • A function that converts the all-float parameter suggestions produced by _space.random_sample() and acq_max() into kernel representation.
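For a single categorical parameter, the three functions above could look something like the following sketch (the level names are invented for the example; this is not the package's actual API):

```python
import numpy as np

LEVELS = ["adam", "sgd", "rmsprop"]  # hypothetical categorical levels

def canonical_to_kernel(value):
    """Canonical -> kernel representation: one-hot encode the category."""
    vec = np.zeros(len(LEVELS))
    vec[LEVELS.index(value)] = 1.0
    return vec

def kernel_to_canonical(vec):
    """Kernel -> canonical representation (for logs, optimizer.max(), ...):
    reverse the one-hot encoding."""
    return LEVELS[int(np.argmax(vec))]

def floats_to_kernel(vec):
    """All-float suggestion (e.g. from random sampling or acquisition
    maximization) -> kernel representation: snap the continuous relaxation
    to the nearest one-hot corner."""
    out = np.zeros(len(LEVELS))
    out[int(np.argmax(vec))] = 1.0
    return out

print(kernel_to_canonical(canonical_to_kernel("sgd")))  # 'sgd'
print(floats_to_kernel([0.2, 0.1, 0.7]))                # [0. 0. 1.]
```

Round-tripping canonical -> kernel -> canonical is lossless; the only lossy step is snapping the all-float suggestions, which is exactly the point of the transform.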

Naturally, it requires changing a lot of other code too, particularly in ways that make everything a bit messier. Additionally, wrapping the kernel requires some slightly hacky Python magic due to the way sklearn.gaussian_process' kernels are set up.

Alternatives: Instead of offering proper support, we could simply refer users to @leifvan's implementation here. Alternatively, we could set up a notebook that demonstrates his approach. I almost favour this option, since I'm worried about cluttering the API too much.

Are you able and willing to implement this feature yourself and open a pull request?

  • [x] Yes, I can provide this feature -- it's pretty much ready/functional too, I will push it soon and link it here.

till-m avatar Nov 07 '22 07:11 till-m

Hey, I must admit I've never had to use categorical parameters and as such never given it much thought :-P Some very smart people wrote that thread! Agree we should document this more explicitly. Some thoughts:

  • It doesn't actually look like it would be that much additional clutter to the API? e.g. if we just supply a custom kernel and then update pbounds to support optional typing? However I also think just a notebook demo would be quite fine.
  • Also, this package can work with arbitrary kernels, which is not documented in the examples at the moment. So from this perspective, having an example that demonstrates how to construct a custom kernel would be good (I also planned to do this in the noise example that I haven't written yet).
  • If we follow the way this is implemented by @leifvan, we should use set_gp_params instead of bo._gp - I think it should work the same way
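As an illustrative sketch of the custom-kernel route (this is not @leifvan's exact code), one can subclass an sklearn kernel and round its inputs in `__call__`, so the GP treats every point inside the same integer cell as identical:

```python
import numpy as np
from sklearn.gaussian_process.kernels import Matern

class RoundingMatern(Matern):
    """Illustrative sketch: a Matern kernel that rounds its inputs before
    evaluation, so points in the same integer cell are indistinguishable
    to the GP."""

    def __call__(self, X, Y=None, eval_gradient=False):
        X = np.round(np.asarray(X, dtype=float))
        if Y is not None:
            Y = np.round(np.asarray(Y, dtype=float))
        return super().__call__(X, Y, eval_gradient=eval_gradient)

# 0.2 and 0.4 both round to 0, so the kernel sees them as the same point:
K = RoundingMatern(nu=2.5)(np.array([[0.2], [0.4]]))
print(K[0, 1])  # 1.0
```

Following the point above, such a kernel would then be passed in via optimizer.set_gp_params(kernel=RoundingMatern(nu=2.5)) rather than by reaching into bo._gp directly.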

bwheelz36 avatar Nov 07 '22 20:11 bwheelz36

Hi @bwheelz36,

I think you're right about it not cluttering the (user-facing) API too much. Internally, however, there is a substantial back-and-forth happening when transforming the parameters, especially if we want to handle everything for the user -- at least I didn't find a way around this. I think the question is ultimately how "convenient" we want to make things for endusers.

As for my draft, see here. The code itself needs a lot of clean-up but you can see how it would work from a user-perspective.

till-m avatar Nov 09 '22 15:11 till-m

Hello till,

This feature to support non-float types is not implemented on the master branch, right? I'm trying to use the functionality on the master branch, but I always get errors. BTW, I used the command pip install bayesian-optimization to install it, and it cannot process non-float types as in your example, so do you have any suggestions? Thanks a lot.

Chiu-Ping avatar Dec 19 '22 10:12 Chiu-Ping

Hi, no, this feature is not on the main branch yet. I think you should be able to install the fork you reference above like this: pip install git+https://github.com/till-m/BayesianOptimization/tree/parameter-types. Having said that, I don't think this feature is complete yet.

bwheelz36 avatar Dec 20 '22 02:12 bwheelz36

I guess it would be quite helpful to have someone test this, right @till-m?

bwheelz36 avatar Dec 20 '22 02:12 bwheelz36

Yes, expect it to be a bit unstable, but I would love some feedback :)

till-m avatar Dec 20 '22 07:12 till-m

Hi @Chiu-Ping, did you ever get around to testing the feature?

till-m avatar Apr 25 '23 13:04 till-m