Tuning alpha and sample.fraction in regression_forest
Hi grf team,
I was trying to understand tuning in regression_forest() by reading through tune_forest.R and realized that the upper bound for $\alpha$, the minimum fraction of training examples in each parent node that must be put in each child node, is 0.25, exceeding the bound in Theorem 1 of Wager & Athey (2018), which is 0.2. Specifically, the get_params_from_draw function in tune_forest.R scales an alpha drawn from $\mathrm{Uniform}[0, 1]$ by 1/4 instead of 1/5:
```r
get_params_from_draw <- function(nrow.X, ncol.X, draws) {
  if (is.vector(draws)) {
    draws <- rbind(c(draws))
  }
  n <- nrow(draws)
  vapply(colnames(draws), function(param) {
    if (param == "min.node.size") {
      return(floor(2^(draws[, param] * (log(nrow.X) / log(2) - 4))))
    } else if (param == "sample.fraction") {
      return(0.05 + 0.45 * draws[, param])
    } else if (param == "mtry") {
      return(ceiling(min(ncol.X, sqrt(ncol.X) + 20) * draws[, param]))
    } else if (param == "alpha") {
      return(draws[, param] / 4) # scales alpha drawn from a uniform distribution by 1/4 instead of 1/5
    } else if (param == "imbalance.penalty") {
      return(-log(draws[, param]))
    } else if (param == "honesty.fraction") {
      return(0.5 + (0.8 - 0.5) * draws[, param]) # honesty.fraction in U(0.5, 0.8)
    } else if (param == "honesty.prune.leaves") {
      return(ifelse(draws[, param] < 0.5, TRUE, FALSE))
    } else {
      stop("Unrecognized parameter name provided: ", param)
    }
  }, FUN.VALUE = numeric(n))
}
```
I noticed this because in a simulation I'm running, the tuned $\alpha$ always exceeds 0.2, the upper bound in the paper, so I'm wondering whether this could keep Theorem 1 of Wager & Athey (2018) from holding.
A related question concerns the scaling of the subsample size $s$ required in the paper, which depends on $d$, $\pi$, and $\alpha$. In tune_forest.R this dependence does not appear to be enforced when possible values of sample.fraction are drawn; the get_params_from_draw function above simply requires sample.fraction to lie in $[0.05, 0.5]$. In my simulation, sample.fraction can be very small (around 0.05) when tuning is enabled, and the coverage rate seems to suffer whenever this happens: too few training examples are sampled for each tree, which inflates the bias and leaves the confidence intervals uncentered.
Thank you for your time and all the work!
Alice
If you want parameter choices that are covered by Thm 1 of WA18, I'd recommend just setting parameters directly instead of using the tuning function. (Or, one other option would be to set sample.fraction = 0.5 and alpha = 0.2, and then tune the other parameters.)
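For concreteness, the second option would look roughly like this (a sketch, assuming X and Y are your training matrix and outcome; tune.parameters takes the names of the parameters to cross-validate):

```r
library(grf)

# Fix the parameters that Theorem 1 constrains, and cross-validate the rest.
forest <- regression_forest(
  X, Y,
  sample.fraction = 0.5,
  alpha = 0.2,
  tune.parameters = c("min.node.size", "mtry", "imbalance.penalty",
                      "honesty.fraction", "honesty.prune.leaves")
)
```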
PS: @yqi3 if you are trying out GRF in an RDD setting, the new estimator lm_forest (grf version 2.2.0+) can be used for this purpose, as a local linear regression conditional on X: https://grf-labs.github.io/grf/reference/lm_forest.html#examples
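To give a flavor of the interface (a minimal sketch, not taken from the linked examples): lm_forest fits the varying-coefficient model $Y = c(X) + h(X) W + \varepsilon$, so the fitted $h(X)$ acts as a local regression slope on W conditional on X.

```r
library(grf)  # lm_forest requires grf >= 2.2.0

n <- 2000; p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rnorm(n)                          # continuous regressor, e.g. a running variable
Y <- 2 * X[, 1] + pmax(X[, 2], 0) * W + rnorm(n)

forest <- lm_forest(X, Y, W)
slopes <- predict(forest)$predictions  # estimated coefficient h(x) on W for each sample
```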
Hi @swager: Thank you for the suggestions. I ended up modifying the tuning function so that possible values of sample.fraction are not randomly drawn; instead, for any random draw of alpha, the corresponding sample.fraction is set to $\lceil n^{\beta_{\min}} \rceil / n$ times a constant (e.g., 0.5). This seems to achieve better performance than using the default sample.fraction and alpha and tuning the rest.
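In case it's useful, here is a rough sketch of that mapping (my own code, not part of grf), assuming $\beta_{\min}$ is defined as in Theorem 1 of Wager & Athey (2018), $\beta_{\min} = 1 - \bigl(1 + \tfrac{d}{\pi}\,\tfrac{\log(\alpha^{-1})}{\log((1-\alpha)^{-1})}\bigr)^{-1}$, with $d$ the number of features and $\pi$ the per-feature split probability lower bound:

```r
# Sketch: map a drawn alpha to a sample.fraction via the subsample-size scaling
# suggested by Theorem 1 of Wager & Athey (2018). Not part of grf; d and pi must
# be supplied by the user.
sample_fraction_from_alpha <- function(alpha, n, d, pi, const = 0.5) {
  beta.min <- 1 - 1 / (1 + (d / pi) * log(1 / alpha) / log(1 / (1 - alpha)))
  const * ceiling(n^beta.min) / n
}

# Example: n = 5000 observations, d = 10 features, pi = 1/d, a tuning draw of alpha = 0.15.
sample_fraction_from_alpha(alpha = 0.15, n = 5000, d = 10, pi = 1 / 10)
```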
Thanks @erikcs for the reference. That's a very interesting application, but I'm trying out forests in a multiple-score setting and would like to avoid using kernel functions.