grf
Interpretation for best linear projection
Dear grf team, thank you for making such a wonderful package. I am currently analyzing data with a causal forest. While checking the best_linear_projection results, I ran into two questions.
- In the best_linear_projection function, statistical significance changes considerably depending on which covariates make up the matrix. In other words, the results of a BLP on all variables differ from those of a BLP on only the variables of interest (see the results below). Is it reasonable to run a variable selection procedure, such as stepwise selection in a linear model, and then perform the BLP on a matrix containing only the selected variables?
Result for BLP with all variables
Best linear projection of the conditional average treatment effect.
Confidence intervals are cluster- and heteroskedasticity-robust (HC3):
               Estimate  Std. Error t value Pr(>|t|)
(Intercept) -8.3923e-01  3.4658e-01 -2.4215  0.01584 *
PRAPACHE     1.7305e-02  9.8373e-03  1.7591  0.07923 .
AGE          7.5675e-03  3.0559e-03  2.4764  0.01363 *
BLGCS       -1.3452e-02  1.5681e-02 -0.8578  0.39143
ORGANNUM     2.9168e-02  6.1689e-02  0.4728  0.63656
BLIL6        7.4330e-07  5.6685e-07  1.3113  0.19042
BLLPLAT      4.7364e-04  3.4598e-04  1.3690  0.17168
BLLBILI      1.0151e-02  1.2267e-02  0.8276  0.40836
BLLCREAT    -4.9351e-03  1.2213e-02 -0.4041  0.68635
TIMFIRST     5.8356e-05  4.6491e-05  1.2552  0.21005
BLADL       -1.0051e-02  1.2494e-02 -0.8045  0.42154
blSOFA      -4.6013e-04  2.2480e-02 -0.0205  0.98368
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Result for BLP with only the variables of interest
Best linear projection of the conditional average treatment effect.
Confidence intervals are cluster- and heteroskedasticity-robust (HC3):
              Estimate Std. Error t value  Pr(>|t|)
(Intercept) -0.8888329  0.2125291 -4.1822 3.449e-05 ***
PRAPACHE     0.0222930  0.0069279  3.2179  0.001381 **
AGE          0.0066299  0.0028854  2.2977  0.022020 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
- For the second result above (the BLP with only PRAPACHE and AGE), I interpreted the AGE coefficient as: "When AGE increases by one unit, the CATE increases by 0.0066, and this is statistically significant." Is this interpretation valid?
Best regards.
Hi @Einspanner8888, you can interpret the BLP as just another linear regression, one that under the hood uses a more involved construction for the left-hand side (doubly robust scores from the forest instead of an observed outcome). So running your favorite stepwise selection procedure and interpreting the coefficients as you would in an OLS regression is perfectly reasonable, though of course the usual caveats around multiple testing and post-selection inference apply.
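To make the "just another linear regression" point concrete, here is a minimal sketch on simulated data (not the data from this thread). The data-generating process, the variable names X1 to X5, and the choice of covariates in A are all assumptions made for illustration. It assumes a grf version that exports get_scores(), and compares the coefficients from best_linear_projection() with a plain lm() fit of the doubly robust scores on the same covariates.

```r
library(grf)

set.seed(42)

# Simulated data (hypothetical, for illustration only):
# the true CATE is 0.5 * X1, so only X1 drives heterogeneity.
n <- 2000
p <- 5
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("X", seq_len(p))
W <- rbinom(n, 1, 0.5)               # randomized binary treatment
tau <- 0.5 * X[, 1]                  # true treatment effect
Y <- tau * W + X[, 2] + rnorm(n)     # outcome

forest <- causal_forest(X, Y, W)

# BLP of the CATE on a chosen subset of covariates.
A <- X[, c("X1", "X2")]
blp <- best_linear_projection(forest, A)
print(blp)

# The same regression done by hand: doubly robust scores on the left-hand side.
dr_scores <- get_scores(forest)
fit <- lm(dr_scores ~ A)
print(coef(fit))

# Largest absolute difference between the two sets of point estimates.
max_diff <- max(abs(unname(coef(fit)) - unname(blp[, "Estimate"])))
print(max_diff)
```

The point estimates from the two fits should agree up to numerical precision. The standard errors reported by best_linear_projection() are heteroskedasticity-robust (HC3 by default), so they will generally differ from those in summary(fit). Each coefficient reads as in OLS: the change in the linear approximation of the CATE per one-unit change in that covariate, holding the other covariates in A fixed. If the columns of A were chosen by a stepwise procedure on the same data, the reported p-values do not account for that selection step.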