Tuning alpha and sample.fraction in regression_forest
Hi grf team,
I was trying to understand tuning in regression_forest() by reading through tune_forest.R and realized that the upper bound for $\alpha$, the minimum fraction of training examples in each parent node that must be put in each child node, is 0.25, exceeding the bound in Theorem 1 of Wager & Athey (2018), which is 0.2. Specifically, the get_params_from_draw function in tune_forest.R scales an alpha drawn from $\mathrm{Uniform}[0, 1]$ by 1/4 instead of 1/5:
```r
get_params_from_draw <- function(nrow.X, ncol.X, draws) {
  if (is.vector(draws)) {
    draws <- rbind(c(draws))
  }
  n <- nrow(draws)
  vapply(colnames(draws), function(param) {
    if (param == "min.node.size") {
      return(floor(2^(draws[, param] * (log(nrow.X) / log(2) - 4))))
    } else if (param == "sample.fraction") {
      return(0.05 + 0.45 * draws[, param])
    } else if (param == "mtry") {
      return(ceiling(min(ncol.X, sqrt(ncol.X) + 20) * draws[, param]))
    } else if (param == "alpha") {
      return(draws[, param] / 4) # scales alpha drawn from a uniform distribution by 1/4 instead of 1/5
    } else if (param == "imbalance.penalty") {
      return(-log(draws[, param]))
    } else if (param == "honesty.fraction") {
      return(0.5 + (0.8 - 0.5) * draws[, param]) # honesty.fraction in U(0.5, 0.8)
    } else if (param == "honesty.prune.leaves") {
      return(ifelse(draws[, param] < 0.5, TRUE, FALSE))
    } else {
      stop("Unrecognized parameter name provided: ", param)
    }
  }, FUN.VALUE = numeric(n))
}
```
I noticed this because in a simulation I'm running, the tuned $\alpha$ always exceeds 0.2, the upper bound in the paper, so I'm wondering whether this could keep Theorem 1 of Wager & Athey (2018) from holding.
A related question concerns the scaling of the subsample size $s$ required in the paper, which depends on $d$, $\pi$, and $\alpha$. In tune_forest.R this dependence does not appear to be enforced when possible values of sample.fraction are drawn; the get_params_from_draw function above simply requires sample.fraction to lie in $[0.05, 0.5]$. In my simulation, sample.fraction can be very small (around 0.05) when tuning is enabled, and the coverage rate seems to suffer whenever this happens: too few training examples are sampled for each tree, which inflates the bias and leaves the confidence intervals uncentered.
Thank you for your time and all the work!
Alice
If you want parameter choices that are covered by Thm 1 of WA18, I'd recommend just setting parameters directly instead of using the tuning function. (Or, one other option would be to set sample.fraction = 0.5 and alpha = 0.2, and then tune the other parameters.)
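For concreteness, the second option would look roughly like this (a sketch, assuming X and Y are your training matrix and outcome; tune.parameters takes the names of the parameters to cross-validate):

```r
library(grf)

# Fix the parameters that Theorem 1 constrains, and cross-validate the rest.
forest <- regression_forest(
  X, Y,
  sample.fraction = 0.5,
  alpha = 0.2,
  tune.parameters = c("min.node.size", "mtry", "imbalance.penalty",
                      "honesty.fraction", "honesty.prune.leaves")
)
```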
PS: @yqi3 if you are trying out GRF in an RDD setting, the new estimator lm_forest (grf version 2.2.0+) can be used for this purpose, as a local linear regression conditional on X: https://grf-labs.github.io/grf/reference/lm_forest.html#examples
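To give a flavor of the interface (a minimal sketch, not taken from the linked examples): lm_forest fits the varying-coefficient model $Y = c(X) + h(X) W + \varepsilon$, so the fitted $h(X)$ acts as a local regression slope on W conditional on X.

```r
library(grf)  # lm_forest requires grf >= 2.2.0

n <- 2000; p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rnorm(n)                          # continuous regressor, e.g. a running variable
Y <- 2 * X[, 1] + pmax(X[, 2], 0) * W + rnorm(n)

forest <- lm_forest(X, Y, W)
slopes <- predict(forest)$predictions  # estimated coefficient h(x) on W for each sample
```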
Hi @swager: Thank you for the suggestions. I ended up modifying the tuning function so that possible values of sample.fraction are not randomly drawn; instead, for any random draw of alpha, the corresponding sample.fraction is set to $\lceil n^{\beta_{\min}} \rceil / n$ times a constant (e.g., 0.5). This seems to achieve better performance than using the default sample.fraction and alpha and tuning the rest.
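In case it's useful, here is a rough sketch of that mapping (my own code, not part of grf), assuming $\beta_{\min}$ is defined as in Theorem 1 of Wager & Athey (2018), $\beta_{\min} = 1 - \bigl(1 + \tfrac{d}{\pi}\,\tfrac{\log(\alpha^{-1})}{\log((1-\alpha)^{-1})}\bigr)^{-1}$, with $d$ the number of features and $\pi$ the per-feature split probability lower bound:

```r
# Sketch: map a drawn alpha to a sample.fraction via the subsample-size scaling
# suggested by Theorem 1 of Wager & Athey (2018). Not part of grf; d and pi must
# be supplied by the user.
sample_fraction_from_alpha <- function(alpha, n, d, pi, const = 0.5) {
  beta.min <- 1 - 1 / (1 + (d / pi) * log(1 / alpha) / log(1 / (1 - alpha)))
  const * ceiling(n^beta.min) / n
}

# Example: n = 5000 observations, d = 10 features, pi = 1/d, a tuning draw of alpha = 0.15.
sample_fraction_from_alpha(alpha = 0.15, n = 5000, d = 10, pi = 1 / 10)
```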
Thanks @erikcs for the reference. That's a very interesting application, but I'm trying out forests in a multiple-score setting and would like to avoid using kernel functions.