optunity
Integer Hyperparameters
Love optunity. I can't seem to figure out how to optimize an integer hyperparameter. Here is my function:
@optunity.cross_validated(x=df_x.as_matrix(), y=df_y.as_matrix(), num_folds=5)
def xgb_model(x_train, y_train, x_test, y_test,
              max_depth=9, min_child_weight=1, subsample=0.8, colsample_bytree=0.8):
    learning_rate = .1
    n_estimators = 400

    clf = XGBClassifier(learning_rate=learning_rate,
                        n_estimators=n_estimators,
                        max_depth=max_depth,
                        min_child_weight=min_child_weight,
                        gamma=0,
                        subsample=subsample,
                        colsample_bytree=colsample_bytree,
                        objective='binary:logistic',
                        max_delta_step=1,  # For imbalanced data set
                        scale_pos_weight=scale_pos_weight,
                        nthread=8,
                        seed=27)

    msg("INFO", "Training model...")
    clf.fit(x_train, y_train, eval_metric='auc')
    # Score the held-out fold by AUC
    y_prob = clf.predict_proba(x_test)[:, 1]
    auc = optunity.metrics.roc_auc(y_test, y_prob)
    return auc
But when I run it, optunity is passing a float as max_depth, which isn't allowed:

...
  File "/usr/local/lib/python2.7/site-packages/optunity/constraints.py", line 266, in func
    return f(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/optunity/cross_validation.py", line 403, in __call__
    scores.append(self.f(**kwargs))
  File "./my-model.py", line 274, in xgb_model
    clf.fit(x_train, y_train, eval_metric='auc')
...
  File "/usr/local/lib/python2.7/site-packages/xgboost/core.py", line 806, in update
    _check_call(_LIB.XGBoosterUpdateOneIter(self.handle, iteration, dtrain.handle))
  File "/usr/local/lib/python2.7/site-packages/xgboost/core.py", line 127, in _check_call
    raise XGBoostError(_LIB.XGBGetLastError())
XGBoostError: Invalid Parameter format for max_depth expect int but value='10.02734375'
The variables are treated as floats in optunity, so you have to convert them back to int when calling XGBClassifier. For example:

clf = XGBClassifier(learning_rate=learning_rate,
                    n_estimators=n_estimators,
                    max_depth=int(max_depth),
                    ...

Any variable that needs to be an int has to be passed through int(x) or int(round(x)).
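A minimal sketch of the corrected objective, pared down from the function above (df_x and df_y are the asker's dataframes, and only max_depth is cast here; the same cast applies to anything else that must be an integer, e.g. n_estimators if you tune it):

import optunity
import optunity.metrics
from xgboost import XGBClassifier

@optunity.cross_validated(x=df_x.as_matrix(), y=df_y.as_matrix(), num_folds=5)
def xgb_model(x_train, y_train, x_test, y_test,
              max_depth=9, min_child_weight=1, subsample=0.8, colsample_bytree=0.8):
    # Optunity hands over reals, so round/cast the integer-valued hyperparameter here.
    max_depth = int(round(max_depth))

    clf = XGBClassifier(learning_rate=0.1,
                        n_estimators=400,
                        max_depth=max_depth,
                        min_child_weight=min_child_weight,
                        subsample=subsample,
                        colsample_bytree=colsample_bytree,
                        objective='binary:logistic',
                        seed=27)
    clf.fit(x_train, y_train, eval_metric='auc')
    y_prob = clf.predict_proba(x_test)[:, 1]
    return optunity.metrics.roc_auc(y_test, y_prob)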
Yes, we can convert the values, but that doesn't constrain the search space for the optimization, and constraining the search space is the whole point.
You're right, postprocessing the reals won't properly constrain the search space, but the results are nearly identical in terms of convergence speed. Note that Optunity's solvers are all continuous at the moment, so rounding is the best we can do for now.
In the upcoming release we will have more support for integers and other value types, and we will also be able to recognize duplicate candidate solutions more effectively. That said, the internal solvers will still remain continuous; we have no concrete plans to add integer/mixed-type solvers at the moment.
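To make the practical effect concrete, here is a sketch of how the search stays bounded with the current continuous solvers (the bounds and num_evals below are illustrative, not a recommendation): the box constraints passed to optunity.maximize limit the range the solver samples from, and the rounding inside the objective maps each real-valued candidate onto the intended integer grid.

# Illustrative box constraints; the solver proposes reals inside these ranges,
# and xgb_model rounds max_depth to the nearest integer internally.
optimal_pars, details, _ = optunity.maximize(xgb_model,
                                             num_evals=100,
                                             max_depth=[3, 12],
                                             min_child_weight=[1, 10],
                                             subsample=[0.5, 1.0],
                                             colsample_bytree=[0.5, 1.0])

print("best max_depth: %d" % int(round(optimal_pars['max_depth'])))
print("best cross-validated AUC: %f" % details.optimum)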