grf Interpretation for best linear projection

Open Einspanner8888 opened this issue 2 years ago • 1 comments

Dear grf team, thank you for making such a wonderful package. I am currently analyzing data using a causal forest. While checking the best_linear_projection results, I ran into two questions.

First, in the best_linear_projection function, statistical significance changes considerably depending on which covariates make up the matrix. In other words, the results of running BLP with all variables differ from those of running BLP with only the variables of interest (see the results below). I wonder whether it is reasonable to do variable selection, such as a stepwise procedure as in a linear model, and then run BLP on a matrix containing only the selected variables.

Result for BLP with all variables

Best linear projection of the conditional average treatment effect.
Confidence intervals are cluster- and heteroskedasticity-robust (HC3):

               Estimate  Std. Error t value Pr(>|t|)  
(Intercept) -8.3923e-01  3.4658e-01 -2.4215  0.01584 *
PRAPACHE     1.7305e-02  9.8373e-03  1.7591  0.07923 .
AGE          7.5675e-03  3.0559e-03  2.4764  0.01363 *
BLGCS       -1.3452e-02  1.5681e-02 -0.8578  0.39143  
ORGANNUM     2.9168e-02  6.1689e-02  0.4728  0.63656  
BLIL6        7.4330e-07  5.6685e-07  1.3113  0.19042  
BLLPLAT      4.7364e-04  3.4598e-04  1.3690  0.17168  
BLLBILI      1.0151e-02  1.2267e-02  0.8276  0.40836  
BLLCREAT    -4.9351e-03  1.2213e-02 -0.4041  0.68635  
TIMFIRST     5.8356e-05  4.6491e-05  1.2552  0.21005  
BLADL       -1.0051e-02  1.2494e-02 -0.8045  0.42154  
blSOFA      -4.6013e-04  2.2480e-02 -0.0205  0.98368  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Result for BLP with only the variables of interest

Best linear projection of the conditional average treatment effect.
Confidence intervals are cluster- and heteroskedasticity-robust (HC3):

              Estimate Std. Error t value  Pr(>|t|)    
(Intercept) -0.8888329  0.2125291 -4.1822 3.449e-05 ***
PRAPACHE     0.0222930  0.0069279  3.2179  0.001381 ** 
AGE          0.0066299  0.0028854  2.2977  0.022020 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Second, for the result of the BLP with only the variables of interest, I interpreted it as "When AGE increases by one unit, the CATE increases by 0.0066, and this is statistically significant." I wonder if this interpretation is valid.

Best regards.

Feb 07 '23 02:02 Einspanner8888

Hi @Einspanner8888, you can interpret the BLP as just another linear regression, except that under the hood it uses a more involved construction for the LHS. So running your favorite stepwise selection procedure, and interpreting the coefficients as you would in an OLS regression, is perfectly reasonable (but of course the usual considerations around multiple testing and inference after model selection apply).

Mar 22 '23 17:03 erikcs
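
For readers who want to see the "just another linear regression" point in action, here is a minimal sketch on simulated data. It is not from the thread: the data-generating process and the column names (age, score, x3, x4) are made up for illustration. It fits a causal forest, runs best_linear_projection on all covariates and on a subset, and then regresses the forest's doubly robust scores (from get_scores) on the same subset with plain lm. Assuming a recent grf version and the default target.sample = "all", the lm point estimates should match the BLP estimates; the BLP output additionally reports HC3 robust standard errors.

```r
library(grf)

set.seed(42)
n <- 2000
p <- 4

# Simulated covariates; the column names are hypothetical, for illustration only.
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- c("age", "score", "x3", "x4")

W <- rbinom(n, 1, 0.5)                  # randomized binary treatment
tau <- 0.5 * X[, "age"]                 # true CATE depends only on "age"
Y <- X[, "score"] + tau * W + rnorm(n)  # outcome

cf <- causal_forest(X, Y, W, num.trees = 500)

# BLP of the CATE on all covariates vs. on a subset of interest.
blp_all <- best_linear_projection(cf, X)
A_sub <- X[, c("age", "score")]
blp_sub <- best_linear_projection(cf, A_sub)
print(blp_all)
print(blp_sub)

# The "more involved LHS": doubly robust scores derived from the forest.
# Regressing them on the same covariates with ordinary lm should give
# the same point estimates as blp_sub (assumption: default settings).
dr_scores <- get_scores(cf)
ols_sub <- lm(dr_scores ~ A_sub)
print(coef(ols_sub))
```

Because the projection is defined relative to whichever covariates are passed in, the coefficients and standard errors in blp_all and blp_sub will generally differ, as they would between a long and a short OLS regression. A coefficient is read as the change in the projected CATE per unit change in that covariate, holding the other included covariates fixed.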
