
How should tuning be implemented?


GRF includes tuning facilities for many of its estimators. In particular, the following estimators expose tuning parameter options:

  • Regression forest
  • Causal forest
  • Instrumental forest
  • Local linear forest
  • Boosted forest
  • Causal survival forest

In addition, some forests use tuning implicitly and/or pass tuning parameters down into internal forests (a rough sketch of this pass-down pattern follows the list):

  • Causal forest performs tuning but also passes tuning parameters down into the orthogonalization forests (regression and boosted), in which tuning is performed separately.
  • Instrumental forest performs tuning but also passes tuning parameters down into the orthogonalization regression forest, in which tuning is performed separately.
  • Boosted forest applies tuning parameters to the initial forest, but not to the subsequent boosted forests.
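As a rough illustration of that pass-down behavior, here is a minimal Python sketch. All class and argument names (`CausalForest`, `RegressionForest`, `tune_params`) are hypothetical, not grf's or skgrf's actual API:

```python
# Hypothetical sketch of how a causal forest forwards tuning parameters
# to its internal orthogonalization forests, which then tune themselves
# independently. Names are illustrative, not the actual skgrf API.
class RegressionForest:
    def __init__(self, tune_params=None):
        # The orthogonalization forest runs its own tuning pass.
        self.tune_params = tune_params


class CausalForest:
    def __init__(self, tune_params=None):
        # The causal forest tunes itself with tune_params...
        self.tune_params = tune_params
        # ...and also forwards them to the internal forests used to
        # orthogonalize the outcome and the treatment, each of which
        # performs its own separate tuning.
        self.y_hat_forest = RegressionForest(tune_params=tune_params)
        self.w_hat_forest = RegressionForest(tune_params=tune_params)
```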

Scikit-learn also provides facilities for hyperparameter tuning under the model_selection module. This raises the question: when and where in skgrf should tuning be implemented, if at all?
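For concreteness, tuning a skgrf estimator through model_selection would look roughly like the sketch below. The estimator name and import path (`skgrf.ensemble.GRFRegressor`) and the grf-style parameters in the grid are assumptions; skgrf's actual API may differ:

```python
# Sketch: tuning a skgrf estimator with sklearn's model_selection tools
# instead of grf's built-in tuning facilities.
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV

from skgrf.ensemble import GRFRegressor  # hypothetical name/path

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

search = GridSearchCV(
    GRFRegressor(),
    param_grid={"mtry": [2, 4], "min_node_size": [5, 10]},  # illustrative
    cv=3,
)
search.fit(X, y)
print(search.best_params_)
```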

  1. Make skgrf a true port of R-grf. This means implementing tuning exactly as it exists in the R library, ignoring sklearn model selection, and hardcoding tuning in the same way.

  2. Ignore R-grf's tuning entirely, allowing users to utilize the model_selection module. This means, however, that the implementations for Causal, Instrumental, and Boosted forests would differ from what exists in R.

  3. Selectively implement R-grf's tuning in order to maintain parity with its implicit tuning. This is the current implementation.

  4. Refactor some of the estimators to allow more fine-grained control, so that separate components can be tuned independently; remove tuning from skgrf and allow users to tune with model_selection objects (sketched below).
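To illustrate option 4: grf's R causal_forest already accepts precomputed Y.hat and W.hat, so a refactored skgrf could expose the same hook and let users tune each component separately. A minimal sketch, assuming hypothetical names and signatures (`GRFRegressor`, `GRFCausalRegressor`, the `y_hat`/`w_hat` keyword arguments, and the `fit(X, y, w)` signature):

```python
# Hypothetical sketch of option 4: tune the orthogonalization forests
# independently via sklearn, then hand their predictions to the causal
# forest so it skips internal orthogonalization and tuning entirely.
# All skgrf names and signatures here are assumptions for illustration.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV

from skgrf.ensemble import GRFCausalRegressor, GRFRegressor  # assumed names

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
w = np.random.default_rng(0).binomial(1, 0.5, size=len(y))  # treatment

param_grid = {"min_node_size": [5, 10]}  # illustrative grf-style parameter

# Tune each orthogonalization (nuisance) forest independently.
y_search = GridSearchCV(GRFRegressor(), param_grid, cv=3).fit(X, y)
w_search = GridSearchCV(GRFRegressor(), param_grid, cv=3).fit(X, w)

# Pass the tuned nuisance estimates to the causal forest (hypothetical
# kwargs, mirroring R-grf's Y.hat / W.hat arguments).
causal = GRFCausalRegressor(y_hat=y_search.predict(X), w_hat=w_search.predict(X))
causal.fit(X, y, w)
```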

crflynn · Feb 21, 2021