BayesianOptimization
Support different data types for optimization parameters
It would be nice to support different data types (e.g. int, float, bool, and perhaps categorical strings) for the parameters over which we optimize. I am not sure what the syntax would look like; perhaps a list of data types passed in that corresponds to the parameter bounds.
All three of these types could be handled the same way: int drawn uniformly from the specified integer interval, bool drawn uniformly from {0, 1}, and categorical strings either mapped to a draw from the integer values [0, 1, ..., n_categories-1] or one-hot encoded as @PedroCardoso suggested below.
See [E. C. Garrido-Merchan and D. Hernandez-Lobato, 2017] for one approach.
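The three cases above can be sketched without touching the kernel, by letting the optimizer suggest continuous values and casting them inside the objective. This is only an illustrative workaround, not part of the library's API; the names `wrap_discrete`, `black_box`, and the bounds shown in the comment are hypothetical:

```python
# Hypothetical sketch: handle int, bool and categorical parameters by
# optimizing over continuous bounds and casting inside the objective.
CATEGORIES = ["linear", "rbf", "poly"]  # example categorical choices

def wrap_discrete(f):
    """Cast continuous suggestions to the discrete types f expects."""
    def wrapped(n_units, use_bias, kernel_idx):
        n_units = int(round(n_units))         # int: round to nearest
        use_bias = bool(round(use_bias))      # bool: threshold at 0.5
        kernel = CATEGORIES[int(kernel_idx)]  # categorical: index map
        return f(n_units, use_bias, kernel)
    return wrapped

def black_box(n_units, use_bias, kernel):
    # stand-in objective; a real one would train and score a model
    return n_units * (2 if use_bias else 1) - CATEGORIES.index(kernel)

f = wrap_discrete(black_box)
# the optimizer would then suggest continuous values inside e.g.:
# pbounds = {"n_units": (8, 64), "use_bias": (0, 1),
#            "kernel_idx": (0, len(CATEGORIES) - 1e-4)}
print(f(16.7, 0.8, 2.3))  # -> 32, i.e. black_box(17, True, "poly")
```

The downside, which the paper above addresses at the kernel level, is that the Gaussian process still models the function as smooth in the continuous space, so many distinct suggestions can collapse onto the same discrete point.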
Interesting, the kernel change they propose wouldn't be too hard to implement. My only concern is making the API increasingly cumbersome by piling on features. However, this one is requested often enough to be worth considering.
I can try to take a look at it too. I'll let you know if I get anywhere.
Could I propose a different approach for the categorical string data type? I would suggest a one-hot implementation: in practice, creating n bool dimensions in the search space, since categorical choices are independent of one another.
Has anyone made progress on this?
It would be really useful to have these types supported.
+1 I'd like this too!
I proposed an implementation for the integer type in a merge request. It is more of a first attempt than finished work, and improvements can be made. Could anyone take a look and discuss it, please?
Great suggestion! Parameter typing would be really useful, especially for categorical parameters.
+1 I would like to use integers
Is it possible to exclude specific points within the bounds? For example, when defining BayesianOptimization(f=black_box_function, pbounds={'e': (0, 1)}), I do not actually want 'e' to be 0. I could write pbounds={'e': (0.0001, 1)} or something similar, but that is not elegant.
Another thing: it sometimes "gets stuck" on points (iterations 10-16), which seems wasteful:
| iter | target | e |
| --- | --- | --- |
| 1 | 0.7492 | 0.2963 |
| 2 | 0.03762 | 0.7072 |
| 3 | 0.6771 | 0.2084 |
| 4 | 0.4013 | 0.4408 |
| 5 | 0.03448 | 0.9871 |
| 6 | 0.3762 | 0.001 |
| 7 | 0.7429 | 0.2671 |
| 8 | 0.7461 | 0.287 |
| 9 | 0.721 | 0.3317 |
| 10 | 0.7492 | 0.2928 |
| 11 | 0.7492 | 0.2927 |
| 12 | 0.7492 | 0.2925 |
| 13 | 0.7492 | 0.2928 |
| 14 | 0.7492 | 0.2917 |
| 15 | 0.7492 | 0.291 |
| 16 | 0.7492 | 0.2945 |
| 17 | 0.7461 | 0.2891 |
| 18 | 0.7367 | 0.3024 |
| 19 | 0.03448 | 0.8466 |
| 20 | 0.05016 | 0.5746 |
Is it intentional?
There's no way to exclude boundary points, and I don't think the extra complexity would be justified. As you mentioned, you can simply use (1e-4, 1) or something similar, since the choice of lower bound should be immaterial. If, however, you believe the difference between picking 1e-5 or 1e-3 as a lower bound matters, you should transform this variable to a log scale.
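The log-scale transform mentioned above can be sketched like this: optimize over log10(e) with linear bounds and exponentiate inside the objective. The names `black_box` and `f_log` are illustrative, and the objective is a stand-in:

```python
import math

# Hypothetical sketch: optimize log10(e) instead of e when the exact
# scale of the lower bound matters.
def black_box(e):
    # stand-in objective with its peak at e = 1e-2
    return -(math.log10(e) + 2) ** 2

def f_log(log_e):
    """Objective seen by the optimizer: works in log10 space."""
    return black_box(10 ** log_e)

# pbounds = {"log_e": (-5, 0)}  # covers e in [1e-5, 1] evenly per decade
print(f_log(-2))  # peak of the stand-in objective: ~0.0
```

This way each decade of e gets equal weight in the search space, instead of the interval [0.1, 1] dominating the bounds.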
And in this particular example the optimizer was not stuck; it was simply exploiting the maximum region around 0.292.
Thank you.
+1 any updates? thanks.
I'm looking to use this for booleans, any updates?