mlrMBO
Bug: small batch size with categorical variables
The script linked below is a standalone reproduction of the error, for filing a bug report with mlrMBO:
https://github.com/rajeeja/mlrmbo-bug/blob/master/mlrMBOMixedIntegerTest11a.R
Please let me know if you need more details.
Hi, you are using the initial design in a weird way. It is simply too small for your big search space.
Why do you generate the design with max.budget points and then take only the first 5 (propose.points)?
Your initial design has to contain each discrete value at least once so that the surrogate can make predictions.
For me it works with design = generateDesign(n = 30, par.set = getParamSet(obj.fun))
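The coverage requirement above can be checked before calling mbo. The following is a minimal sketch in base R, not part of mlrMBO or ParamHelpers: the helper missing_levels, the parameter names, and the example design are all made up for illustration.

```r
# Hypothetical helper (not part of mlrMBO or ParamHelpers): for each discrete
# parameter, list the allowed values that never occur in the design.
missing_levels = function(design, allowed) {
  missing = lapply(names(allowed), function(p) {
    setdiff(allowed[[p]], as.character(design[[p]]))
  })
  names(missing) = names(allowed)
  # Keep only parameters that have at least one uncovered value
  missing[lengths(missing) > 0]
}

# Made-up search space with two discrete parameters
allowed = list(
  kernel = c("linear", "rbf", "poly"),
  optimizer = c("sgd", "adam")
)

# Made-up 4-point design that never tries kernel = "poly"
design = data.frame(
  kernel = c("linear", "rbf", "rbf", "linear"),
  optimizer = c("sgd", "adam", "adam", "sgd"),
  lr = c(0.1, 0.01, 0.5, 0.05)
)

gaps = missing_levels(design, allowed)
print(gaps)
```

An empty list means every discrete value appears at least once in the design. A non-empty result names the values the surrogate would never see during its first fit, which is a hint that the design needs more points.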
@jakob-r Thanks! But "Your initial design has to contain each discrete value at least once so that the surrogate can make predictions." is not sufficient if I use the learner below:
surr.rf = makeLearner("regr.randomForest", predict.type = "se", fix.factors.prediction = TRUE, se.method = "bootstrap", se.boot = 2)
res = mbo(obj.fun, design = design, learner = surr.rf, control = ctrl, show.info = TRUE)
A complete isolated example is here: https://github.com/rajeeja/mlrmbo-bug/blob/master/learner-discrete-param-bug.R
True, my answer is kind of restricted to the surrogate. However, I have doubts that the surrogate will work so well, especially the uncertainty estimation for unknown factors. I am curious to see results of any optimization benchmark using this approach 🙂
Even if I increase propose.points to 1000, I still get this error:
Error in predict.randomForest(getLearnerModel(x), newdata = .newdata, : New factor levels not present in the training data
This is for the example at https://github.com/rajeeja/mlrmbo-bug/blob/master/learner-discrete-param-bug.R
What would be a fix to get something like this to work?
After changing
surr.rf = makeLearner("regr.randomForest",
  predict.type = "se",
  fix.factors.prediction = TRUE,
  se.method = "bootstrap",
  se.boot = 8)
to
surr.rf = makeLearner("regr.randomForest",
  predict.type = "se",
  fix.factors.prediction = TRUE)
it works. I'll update you on the results from this approach. Also, an older version works even with se->
Just found that changing se.method = "bootstrap" to se.method = "jackknife" also works.