etwfe
Add Wild Cluster Bootstrap Support
This PR adds support for inference via a wild (cluster) bootstrap by adding a `bootstrap` argument to `etwfe` (only for OLS). If `bootstrap = TRUE`, `etwfe` will compute marginal effects by calling `fwildclusterboot::boot_aggregate()`, which is a copy of `fixest::aggregate()`.
It currently depends on a fork of `fwildclusterboot`, which in turn depends on a fork of `fixest` by @kylebutts that introduces support for sparse model matrices. In other words, merging this PR will require another PR to be merged into `fixest`.
At the moment, this PR simply
- adds a `bootstrap` argument to `emfx`. If `bootstrap = TRUE`, it will run a wild cluster bootstrap via the `fwildclusterboot` package
- in consequence, `fwildclusterboot` is added as a (soft) dependency in `Suggests`
- at the moment, only `type = "simple"` and the "clustered" bootstrap are supported
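For readers less familiar with the method, here is a rough base-R sketch of what a wild cluster bootstrap does for a single coefficient. It is not how `fwildclusterboot` implements it (that package uses a much faster algorithm); the simulated data, the cluster count, and the helper `cluster_t()` are all assumptions made for illustration. It uses Rademacher weights with the null imposed and no small-sample correction.

```r
# Wild cluster bootstrap-t for H0: beta_x = 0 in y ~ x (illustrative only).
set.seed(1)
G   <- 20                                  # hypothetical number of clusters
n_g <- 30                                  # hypothetical observations per cluster
cl  <- rep(seq_len(G), each = n_g)
x   <- rnorm(G * n_g)
y   <- 0.2 * x + rnorm(G)[cl] + rnorm(G * n_g)  # cluster shock + idiosyncratic noise

X       <- cbind(1, x)
XtX_inv <- solve(crossprod(X))

# Cluster-robust t-statistic for the coefficient on x (no small-sample correction).
cluster_t <- function(y_vec) {
  b      <- XtX_inv %*% crossprod(X, y_vec)
  u      <- as.vector(y_vec - X %*% b)
  scores <- rowsum(X * u, cl)              # G x 2 matrix of per-cluster score sums
  V      <- XtX_inv %*% crossprod(scores) %*% XtX_inv
  b[2] / sqrt(V[2, 2])
}

t_obs <- cluster_t(y)

# Restricted fit that imposes the null: intercept-only model.
fit_r <- mean(y)
u_r   <- y - fit_r

B <- 999
t_boot <- replicate(B, {
  w      <- sample(c(-1, 1), G, replace = TRUE)  # one Rademacher draw per cluster
  y_star <- fit_r + u_r * w[cl]                  # flip residual signs cluster by cluster
  cluster_t(y_star)
})

# Symmetric bootstrap p-value: share of bootstrap |t| at least as large as observed.
p_boot <- mean(abs(t_boot) >= abs(t_obs))
```

The key point is that the weights are drawn once per cluster rather than once per observation, so the within-cluster dependence of the residuals is preserved in every bootstrap sample.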
The PR still
- [ ] ... requires @kylebutts's PR to be merged into `fixest`
- [ ] ... and `fwildclusterboot` being updated afterwards
- [ ] ... lacks support for the heteroskedastic bootstrap
- [ ] ... lacks some defensive checks
- [ ] ... lacks unit tests
- [ ] ... lacks documentation in the vignette
- [ ] I will also have to revert all changes to `etwfe` (it's only white space changes, sorry about that).
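As a starting point for the missing defensive checks, something along these lines might work. The helper name `check_bootstrap_args()`, its defaults, and its error messages are hypothetical and not part of this PR; the sketch only assumes that the model passed to `emfx` inherits from class `fixest`.

```r
# Hypothetical argument checks to run before starting the bootstrap.
check_bootstrap_args <- function(object, type = "simple", B = 9999) {
  # A Suggests-style guard would normally also go here, e.g.
  # requireNamespace("fwildclusterboot", quietly = TRUE).
  if (!inherits(object, "fixest")) {
    stop("`object` must be a fixest model, e.g. one returned by etwfe().")
  }
  if (!identical(type, "simple")) {
    stop("The bootstrap currently supports only `type = \"simple\"`.")
  }
  if (!is.numeric(B) || length(B) != 1L || is.na(B) || B < 1 || B != round(B)) {
    stop("`B` must be a single positive whole number.")
  }
  invisible(TRUE)
}
```

Failing early with a clear message seems preferable to letting an unsupported `type` fall through to the bootstrap code.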
It is also worth discussing how to unify the output: running `marginaleffects` returns a `marginaleffects` object, while running the bootstrap simply returns a `data.frame`.
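One low-effort option, sketched here purely as a discussion aid, would be to relabel the bootstrap columns with the names that `marginaleffects` uses in its output (`estimate`, `statistic`, `p.value`, `conf.low`, `conf.high`). The helper `tidy_boot()` is hypothetical, and it assumes the bootstrap result has exactly five columns in the order estimate, t value, p-value, lower bound, upper bound. It does not create a true `marginaleffects` object.

```r
# Hypothetical helper: give the bootstrap result marginaleffects-style column names.
tidy_boot <- function(boot_out, term = ".Dtreat") {
  boot_out <- as.data.frame(boot_out)
  stopifnot(ncol(boot_out) == 5L)          # estimate, t, p, lower, upper (assumed order)
  names(boot_out) <- c("estimate", "statistic", "p.value", "conf.low", "conf.high")
  data.frame(term = term, boot_out, stringsAsFactors = FALSE)
}

# Usage with the values printed by the bootstrap run in the example code.
boot_raw <- matrix(
  c(-0.05062703, -4.078845, 6.00006e-05, -0.07550813, -0.02580929),
  nrow = 1
)
boot_tab <- tidy_boot(boot_raw)
```

This would at least let downstream code address the same column names regardless of which inference path was used.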
Here is some example code:
```r
library(devtools)
install_github("https://github.com/s3alfisc/fwildclusterboot/tree/etwfe-support")
# this should install kyle's fork of fixest, if not, do it manually
#install_github("https://github.com/kylebutts/fixest/tree/sparse-matrix")

library(etwfe)
library(fwildclusterboot)

data("mpdta", package = "did")

mod = etwfe(
  fml  = lemp ~ lpop,
  tvar = year,
  gvar = first.treat,
  data = mpdta,
  #se = "hetero",
  vcov = ~countyreal,
  ssc = fixest::ssc(adj = FALSE, cluster.adj = FALSE)
)
#names(coef(mod))

emfx(mod)
#     Term                 Contrast .Dtreat Estimate Std. Error
#  .Dtreat mean(TRUE) - mean(FALSE)    TRUE  -0.0506     0.0124
#      z Pr(>|z|)    S  2.5 %  97.5 %
#  -4.08   <0.001 14.4 -0.075 -0.0263

emfx(mod, bootstrap = TRUE, B = 99999, nthreads = 2)
# Run the wild bootstrap: this might take some time...(but hopefully not too much time =) ).
# |======================================================| 100%
#         Estimate   t value    Pr(>|t|)     [0.025%     0.975%]
# [1,] -0.05062703 -4.078845 6.00006e-05 -0.07550813 -0.02580929
# Warning messages:
# 1: In emfx(mod, bootstrap = TRUE, B = 99999, nthreads = 2) :
#   The bootstrap does not support the ssc() argument `fixef.K='none'`. Using `fixef.K='none' instead. This will lead to a slightly different non-bootstrapped t-statistic`, but will not affect bootstrapped p-values and CIs.
# 2: Matrix inversion failure: Using a generalized inverse instead.
#   Check the produced t-statistic, does it match the one of your
#   regression package (under the same small sample correction)? If
#   yes, this is likely not something to worry about.
```
@jtorcasso fyi