Bayesian-Optimization
Boolean variables not supported in combination with other categoricals
Describe the bug
When the search space contains a boolean variable in addition to another, non-boolean categorical variable, the search fails.
To Reproduce
The following is a slight modification of an existing test that shows the problem:

```python
import numpy as np
# ParallelBO, RandomForest and the *Space classes are imported as in the existing test

dim_r = 2  # dimension of the real values

def obj_fun(x):
    x_r = np.array([x['continuous_%d' % i] for i in range(dim_r)])
    x_i = x['ordinal']
    x_d = x['nominal']
    _ = 0 if x_d == 'OK' else 1
    return np.sum(x_r ** 2) + abs(x_i - 10) / 123. + _ * 2

# Parentheses allow the sum to span several lines
search_space = (
    ContinuousSpace([-5, 5], var_name='continuous') * dim_r
    + OrdinalSpace([5, 15], var_name='ordinal')
    + NominalSpace(['OK', 'A', None], var_name='nominal')
    + NominalSpace([True, False], var_name='boolvar')  # the boolean variable that triggers the failure
)
model = RandomForest(levels=search_space.levels)

opt = ParallelBO(
    search_space=search_space,
    obj_fun=obj_fun,
    model=model,
    max_FEs=6,
    DoE_size=3,  # the initial DoE size
    eval_type='dict',
    acquisition_fun='MGFI',
    acquisition_par={'t': 2},
    n_job=3,  # number of processes
    n_point=3,  # number of the candidate solution proposed in each iteration
    verbose=False  # turn this off, if you prefer no output
)
xopt, fopt, stop_dict = opt.run()
```
Expected behavior
The search should behave exactly as it does without the boolean variable.
Additional context
The failure seems to be related to the input checking in the random forest.
Somewhere in the random forest these values are converted to strings, which leads to this issue. I haven't yet found exactly where this happens, but it is not limited to boolean variables: any nominal space whose options are not originally strings has the same problem.
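To make the suspected mechanism concrete, here is a minimal pure-Python sketch. It does not use the library and does not claim to show where the conversion actually happens; `stringify` and `unknown_after_stringify` are hypothetical helpers that only imitate "values get turned into strings and are then checked against the original levels".

```python
def stringify(values):
    """Imitate a conversion of sampled values to strings (assumed behaviour)."""
    return [str(v) for v in values]


def unknown_after_stringify(levels, samples):
    """Return the stringified samples that no longer match any original level."""
    return [s for s in stringify(samples) if s not in levels]


bool_levels = [True, False]
depth_levels = [None] + list(range(2, 102, 2))
string_levels = ['OK', 'A']

# Non-string levels stop matching once the samples become strings.
print(unknown_after_stringify(bool_levels, [True, False]))  # ['True', 'False']
print(unknown_after_stringify(depth_levels, [22, 18]))      # ['22', '18']

# Levels that were strings to begin with are unaffected.
print(unknown_after_stringify(string_levels, ['OK']))       # []
```

If something like this happens internally, it would explain why only nominal spaces with non-string options are affected.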
@Dvermetten I actually have a similar issue (maybe the same one). I installed the latest version with pip and I am getting errors of the form:
`ValueError: Found unknown categories ['22', '18', '26', '96'] in column 0 during transform`
I am guessing it is what you described above, because I have a nominal range whose elements are not originally strings: `max_depth = NominalSpace([None] + np.arange(2,102,2).tolist())`
Is there an update/fix on this? I also think the pip version is an older version than what is now on master...
Thanks!
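Until there is a fix, one possible workaround is to give the nominal space only string labels and translate them back inside the objective function. The sketch below is untested against the library and assumes that string-only levels are handled correctly, as the report above suggests; `make_codec` is a hypothetical helper, not part of the package.

```python
def make_codec(levels):
    """Build string labels for arbitrary levels plus a label -> value lookup."""
    decode = {repr(v): v for v in levels}
    if len(decode) != len(levels):
        raise ValueError("two levels share the same string label")
    return list(decode), decode


# The max_depth levels from the comment above, written without NumPy.
labels, decode = make_codec([None] + list(range(2, 102, 2)))

# Assumption: the labels would be passed to the search space instead of the raw values,
# e.g. NominalSpace(labels, var_name='max_depth').


def obj_fun(x):
    # Translate the string label back to the real value before using it.
    max_depth = decode[x['max_depth']]
    return 0.0 if max_depth is None else float(max_depth)


print(labels[:3])                    # ['None', '2', '4']
print(obj_fun({'max_depth': '22'}))  # 22.0
```

The same helper applies to a boolean space: `make_codec([True, False])` yields the labels `'True'` and `'False'`.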