
clustering above assignment level

Open macartan opened this issue 6 years ago • 2 comments

Treatment needs to be nested within clusters for difference_in_means() but not for lm_robust():

data <- fabricate(12, Z = rep(0:1, 6), X = rep(0:1, each = 6), Y = rnorm(12))

lm_robust(Y ~ Z, cluster = X, data = data)
difference_in_means(Y ~ Z, cluster = X, data = data)

Should we not have the same behavior for both?

> lm_robust(Y ~ Z, cluster = X, data = data)
              Estimate Std. Error    t value  Pr(>|t|)  CI Lower CI Upper DF
(Intercept) -0.5717880  0.6304287 -0.9069829 0.5310279 -8.582144 7.438568  1
Z            0.5346832  0.4728588  1.1307461 0.4609849 -5.473557 6.542923  1
> difference_in_means(Y ~ Z, cluster = X, data = data)
Error in difference_in_means_internal(condition1 = condition1, condition2 = condition2,  : 
  All units within a cluster must have the same treatment condition.
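The error fires because both values of Z occur inside each level of X in the example data. A minimal base-R sketch of that kind of check follows; `treatment_nested_in_clusters()` is a hypothetical helper written for illustration, not a function exported by estimatr, and it is not claimed to be how the package implements the test internally.

```r
# Hypothetical helper (not part of estimatr): returns TRUE when every
# cluster contains exactly one treatment condition.
treatment_nested_in_clusters <- function(treatment, cluster) {
  # Count the distinct treatment values observed in each cluster
  conditions_per_cluster <- tapply(
    treatment, cluster,
    function(z) length(unique(z))
  )
  all(conditions_per_cluster == 1)
}

# Same assignment pattern as the example above
Z <- rep(0:1, 6)          # treatment alternates unit by unit
X <- rep(0:1, each = 6)   # two clusters of six units each

treatment_nested_in_clusters(Z, X)  # FALSE: each cluster has Z = 0 and Z = 1
treatment_nested_in_clusters(X, X)  # TRUE: treatment is constant within cluster
```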

macartan avatar Dec 18 '18 12:12 macartan

I'm happy to remove the constraint in difference_in_means() because it is just kicking to lm_robust() in the clustered case anyway.

I don't believe I have access to it today, but the simplification of the CR2 estimator for the case with a single binary predictor and equal-sized clusters (e.g. the commonly used se(SATE) estimator presented in GG on p. 83) requires that treatment take a single value within each cluster.

If we removed this constraint, we should still match the GG estimator in the case with equal-sized clusters and a single treatment condition within each cluster. We would simply be allowing another case that the standard clustered DiM estimator could not accommodate.
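For comparison, here is a hedged sketch of the case that both functions already accept: equal-sized clusters with treatment constant within each cluster. The data and the names `dat`, `cl`, `fit_lm`, and `fit_dim` are made up for illustration. The estimatr calls run only if the package is installed, and no particular output is claimed here.

```r
set.seed(1)

# Illustrative data: six clusters of two units, treatment assigned by cluster
dat <- data.frame(
  cl = rep(1:6, each = 2),
  Z  = rep(rep(0:1, 3), each = 2),
  Y  = rnorm(12)
)

# Confirm treatment does not vary inside any cluster
stopifnot(all(tapply(dat$Z, dat$cl, function(z) length(unique(z))) == 1))

if (requireNamespace("estimatr", quietly = TRUE)) {
  # se_type = "CR2" is set explicitly so the comparison is with the CR2 estimator
  fit_lm  <- estimatr::lm_robust(Y ~ Z, clusters = cl, se_type = "CR2", data = dat)
  fit_dim <- estimatr::difference_in_means(Y ~ Z, clusters = cl, data = dat)

  # Print both summaries side by side to compare the row for Z
  print(estimatr::tidy(fit_lm))
  print(estimatr::tidy(fit_dim))
}
```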

lukesonnet avatar Dec 18 '18 14:12 lukesonnet

difference_in_means() still behaves this way. I think this is confusing, especially since the documentation states that lm_robust() and difference_in_means() are the same in the clustered case.

pmeiners avatar Jul 06 '21 13:07 pmeiners