
Estimation of the R-loss criterion and the Monte Carlo error using causal survival forest

Open mnose opened this issue 2 years ago • 2 comments

After estimating a causal survival forest (Cui et al., 2023), is it possible to compute and report the debiased error (the R-loss) and the excess error (the Monte Carlo error)? Even when I run the sample code for causal survival forest (https://cran.r-project.org/web/packages/grf/grf.pdf), both errors are "NaN". The same happens with my own data. I cannot find them in the original paper either.

Are they not supported in the current code? If so, is there a way to compute them, or to report alternative test statistics for assessing the CSF estimates?

mnose, Sep 05 '23 05:09

Hi @mnose, one suggested way to assess the CSF estimates would be to use the TOC/RATE as mentioned at the end of the CSF docstring example.

debiased.error predictions are currently not implemented, but the implied R-loss could be backed out with

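# Out-of-bag CATE predictions from the fitted causal survival forest.
tau.hat <- predict(forest)$predictions
# Dividing the forest's score (numerator - denominator * tau.hat) by the centered
# treatment W - W.hat gives an R-learner-style residual; average its square.
mean(
  ((forest[["_psi"]]$numerator - forest[["_psi"]]$denominator * tau.hat) / (forest$W.orig - forest$W.hat))^2
)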

This would be the appropriate loss for tuning a CSF. But for a CSF that has already been fit (the default parameters are typically reasonable), the RATE is more informative.
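For reference, a minimal sketch of the RATE/TOC evaluation, following the train/evaluate split used in the CSF docstring example. The simulated data, the sample size, and variable names such as csf.train and csf.eval are illustrative assumptions, not part of the question above.

```r
library(grf)

set.seed(42)

# Simulated survival data (illustrative only, not the poster's data).
n <- 1000
p <- 5
X <- matrix(runif(n * p), n, p)
W <- rbinom(n, 1, 0.5)
horizon <- 1
failure.time <- pmin(rexp(n) * X[, 1] + W, horizon)
censor.time <- 2 * runif(n)
Y <- pmin(failure.time, censor.time)
D <- as.integer(failure.time <= censor.time)

# Fit the forest that produces the prioritization rule on one half of the data.
train <- sample(1:n, n / 2)
csf.train <- causal_survival_forest(X[train, ], Y[train], W[train], D[train],
                                    horizon = horizon)

# Rank the held-out units by their predicted CATE.
priority <- predict(csf.train, X[-train, ])$predictions

# Fit a separate forest on the held-out half to evaluate that ranking.
csf.eval <- causal_survival_forest(X[-train, ], Y[-train], W[-train], D[-train],
                                   horizon = horizon)

# RATE (AUTOC by default) with a bootstrap standard error.
rate <- rank_average_treatment_effect(csf.eval, priority)
ci <- rate$estimate + c(-1.96, 1.96) * rate$std.err
cat("RATE:", rate$estimate, " 95% CI:", ci, "\n")

# Plot the TOC curve.
plot(rate)
```

A confidence interval that excludes zero suggests the estimated CATEs rank units in a way that captures some treatment effect heterogeneity. The split keeps the data used to build the ranking separate from the data used to evaluate it.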

erikcs, Sep 05 '23 05:09

Thank you for the information.

After estimating the CATE using the causal survival forest, I examined the partial dependence of the CATE, i.e., how the estimate changes when only a single variable is varied across its quintiles while all other variables are held at their medians. I was able to produce a partial dependence plot similar to the one for causal forest on this website (https://gsbdbi.github.io/ml_tutorial/hte_tutorial/hte_tutorial.html).
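The partial dependence computation described here can be sketched roughly as follows. The simulated data, the choice of covariate j, and the use of the 10/30/50/70/90% quantiles as one representative point per quintile are all assumptions made for illustration.

```r
library(grf)

set.seed(1)

# Simulated survival data (illustrative only).
n <- 1000
p <- 5
X <- matrix(runif(n * p), n, p)
W <- rbinom(n, 1, 0.5)
horizon <- 1
failure.time <- pmin(rexp(n) * X[, 1] + W, horizon)
censor.time <- 2 * runif(n)
Y <- pmin(failure.time, censor.time)
D <- as.integer(failure.time <= censor.time)

forest <- causal_survival_forest(X, Y, W, D, horizon = horizon)

# Vary covariate j over one point per quintile (assumed: quintile midpoints),
# holding every other covariate at its sample median.
j <- 1
x.grid <- quantile(X[, j], probs = seq(0.1, 0.9, by = 0.2))
X.medians <- apply(X, 2, median)
X.test <- matrix(X.medians, nrow = length(x.grid), ncol = p, byrow = TRUE)
X.test[, j] <- x.grid

# Point predictions with pointwise variance estimates.
pred <- predict(forest, X.test, estimate.variance = TRUE)
pd <- data.frame(
  x = x.grid,
  cate = pred$predictions,
  std.err = sqrt(pred$variance.estimates)
)
pd$lower <- pd$cate - 1.96 * pd$std.err
pd$upper <- pd$cate + 1.96 * pd$std.err
print(pd)
```

The intervals here are pointwise only. They say nothing about the covariance between predictions at different grid points, so they do not amount to a joint test across quintiles.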

Using the estimated coefficient and standard error of a particular covariate at each quintile, is there a way to test the equality of coefficients across quintiles (i.e., b_Q1 = b_Q2 = b_Q3 = b_Q4 = b_Q5) under causal survival forest? I suspect that the usual hypothesis tests on linear combinations of coefficients would not be appropriate here. Is there an appropriate alternative way to perform this test (with adjustments for multiple hypothesis testing)?

mnose, Sep 13 '23 21:09