scikit-optimize

gbrt_minimize and forest_minimize documentation

Open topspinj opened this issue 7 years ago • 5 comments

I am reading through skopt's docs and it seems like there might be a typo in the description for both gbrt_minimize and forest_minimize.

gbrt_minimize:

Gradient boosted regression trees are used to model the (very) expensive to evaluate function func. 

forest_minimize:

A tree based regression model is used to model the expensive to evaluate function func. 

What is the above supposed to say?

topspinj, Jun 02 '18 00:06

The descriptions could be less cryptic. Is there also a typo in them?

The basic idea behind all the methods in scikit-optimize is to fit a surrogate model to the responses from the function that we are trying to minimise. Many different kinds of models can be used for this. In gbrt_minimize and forest_minimize we use gradient boosted trees and a random forest, respectively, as that surrogate. Maybe together we can work out a better one-sentence summary and create a PR?

betatim, Jun 02 '18 00:06

@betatim Sounds good. What is "the expensive" referring to?

topspinj, Jun 02 '18 01:06

How about something like:

A tree based regression model is used to evaluate (or optimize/minimize?) the objective function func. The model is improved by sequentially evaluating the expensive function at the next best point.

Perhaps remove 'expensive' from the first sentence since the second sentence also describes the function as expensive?

topspinj, Jun 02 '18 15:06

> What is "the expensive" referring to?

scikit-optimize focusses on minimising expensive objective functions, meaning functions where each evaluation costs a lot of time or money. An example could be finding the best hyper-parameters of a neural network that takes days to train, or finding the best parameters of a real-world process where each trial costs $1,000,000 to run.

I wouldn't say "used to evaluate" as the surrogate model doesn't evaluate the objective, it is used to model it. The combination of surrogate model and the logic behind how to pick the next point is what leads to finding the minimum of the objective.

betatim, Jun 04 '18 04:06
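
The distinction drawn above (the surrogate models the objective, and a separate rule picks the next point) can be sketched as an explicit loop. This is not scikit-optimize's implementation: it uses scikit-learn's `RandomForestRegressor` directly, and the greedy "pick the candidate with the lowest predicted value" rule is a deliberate simplification assumed here for illustration. The `objective`, the bounds, and the budget are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def objective(x):
    # Hypothetical stand-in for an expensive function of two variables;
    # the minimum is at (0.3, -0.2).
    return (x[0] - 0.3) ** 2 + (x[1] + 0.2) ** 2


rng = np.random.default_rng(0)
low, high = -1.0, 1.0

# Start with a few random evaluations of the real objective.
X = rng.uniform(low, high, size=(5, 2))
y = np.array([objective(x) for x in X])

for _ in range(15):
    # Fit the surrogate to every (point, response) pair seen so far.
    surrogate = RandomForestRegressor(n_estimators=50, random_state=0)
    surrogate.fit(X, y)

    # The surrogate is cheap to query, so score many random candidates with it.
    candidates = rng.uniform(low, high, size=(500, 2))
    predicted = surrogate.predict(candidates)

    # Simplified selection rule (an assumption of this sketch): spend one
    # real evaluation on the candidate with the lowest predicted value.
    x_next = candidates[np.argmin(predicted)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))

best = np.argmin(y)
print("best point:", X[best], "best value:", y[best])
```

The real objective is called only 20 times in total (5 initial points plus 15 iterations), while the surrogate is queried 500 times per iteration. In this sketch the forest is fit on the full two-column `X`, so it models the objective over both dimensions jointly.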

To me, it is not clear exactly how the trees are built in forest_minimize and gbrt_minimize. Is one tree built per dimension, or does each tree take the whole search space as input?

I think that adding references could help the understanding of the methods, see https://github.com/scikit-optimize/scikit-optimize/issues/118

00sapo, Mar 25 '21 09:03