grf
Is GRF sensitive to outliers and collinear covariates?
Hi, are grf's ATE estimates and the test_calibration method sensitive to outliers in the data and to collinear covariates?
causal_forest estimates a difference in conditional means, E[Y(1) - Y(0) | X]. A mean is sensitive to outliers, so yes, causal_forest estimates can be sensitive to large outliers in Y.
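To make the point about means concrete, here is a minimal sketch in plain Python. It does not call grf; the outcome values are made up for illustration, and `diff_in_means` is a hypothetical helper, not a grf function. It only shows how a single extreme value of Y moves a difference in means while leaving a median untouched.

```python
from statistics import mean, median

# Hypothetical toy outcomes (not data from this thread).
treated = [2.0, 2.5, 3.0, 3.5, 4.0]
control = [1.0, 1.5, 2.0, 2.5, 3.0]


def diff_in_means(y1, y0):
    """Plain difference in sample means, the simplest analogue of E[Y(1) - Y(0)]."""
    return mean(y1) - mean(y0)


# Estimate on the clean data.
clean = diff_in_means(treated, control)

# Replace one treated outcome with a large outlier and re-estimate.
contaminated = treated[:-1] + [400.0]
shifted = diff_in_means(contaminated, control)

# For contrast: the median of the treated group does not move here.
median_shift = median(contaminated) - median(treated)

print("clean estimate:       ", clean)
print("with one outlier:     ", shifted)
print("change in the median: ", median_shift)
```

One contaminated observation out of five is enough to change the estimate by a large factor, which is the sense in which a mean-based estimator is outlier sensitive.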
GRF is random forest based, so duplicate or collinear covariates make no difference: each split simply uses whichever of them, if any, yields the lowest loss.
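A rough sketch of why a duplicated covariate does not change a tree split, again in plain Python. This assumes a simple CART-style squared-error criterion, which is a stand-in and not grf's actual splitting rule; `sse` and `best_split` are hypothetical helpers and the data are invented. It covers only the exact-duplicate case.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(X, y):
    """Return (loss, feature_index, threshold) for the best single split.

    Ties are broken in favour of the first feature that reaches the loss.
    """
    best = None
    for j in range(len(X[0])):
        # Candidate thresholds: every observed value except the largest.
        for t in sorted(set(row[j] for row in X))[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            candidate = (sse(left) + sse(right), j, t)
            if best is None or candidate[0] < best[0]:
                best = candidate
    return best


# Hypothetical toy data with a clear jump in y after x = 0.5.
x = [0.1, 0.4, 0.5, 0.8, 0.9, 1.2]
y = [1.0, 1.1, 0.9, 3.0, 3.2, 2.9]

X_single = [[xi] for xi in x]
X_dup = [[xi, xi] for xi in x]  # second column is an exact copy of the first

loss_single, _, thr_single = best_split(X_single, y)
loss_dup, feat_dup, thr_dup = best_split(X_dup, y)

print("single column:    ", loss_single, thr_single)
print("duplicated column:", loss_dup, thr_dup, "chosen feature:", feat_dup)
```

Both runs find the same threshold and the same loss; the copy can only tie with the original, so which of the two gets picked is arbitrary and does not change the partition.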
@erikcs Thanks. Are there any criteria we can use to classify data points as outliers?