estimatr
NA coefficients lead to NA F-statistic
I imagine this is pretty low on the priority list, but if you have coefficients that are NA, such as by inducing perfect collinearity with weights, the F-statistic is not calculated in either lm_robust or iv_robust, even though lm handles it fine. There are some strange quirks when you play around with this:
library(tidyverse)
library(estimatr)
o <- 300
tb <- tibble(w = rnorm(o),
             group = rep(c('A','B','C'),o/3),
             #note inclusion of 0 weights that line up with group identifier
             weights = rep(c(0,1,2),o/3),
             z = rnorm(o),
             nu = rnorm(o),
             eps = rnorm(o)) %>%
  mutate(x = z*2 + w + nu) %>%
  mutate(y = x*3 + w + eps)
#Successful coefficient and f-statistic calculation
#Interactions in first (iv) and/or second (lm, iv) stages, with collinear term dropped, ok
summary(lm(y~x+factor(group),data=tb))
summary(lm_robust(y~x+factor(group),data=tb,se_type='classical'))
summary(iv_robust(y~x+factor(group)|z+factor(group),data=tb,se_type='classical',diagnostics=TRUE))
#zero weights along with collinear term being dropped, ok.
summary(lm(y~x,data=tb,weights=weights))
summary(lm_robust(y~x,data=tb,weights=weights,se_type='classical'))
summary(iv_robust(y~x|z,data=tb,weights=weights,se_type='classical',diagnostics = TRUE))
#When the collinearity is introduced because of the zero weights
#(in this case, a second dummy should be dropped)
#lm reports NA coefficients/se/etc estimates instead of dropping, but F is fine
summary(lm(y~x+factor(group),data=tb,weights=weights))
summary(lm(y~x*factor(group),data=tb,weights=weights))
#Under lm_robust, F statistic is fine in the additive version below but not
#the interaction version
summary(lm_robust(y~x+factor(group),data=tb,weights=weights))
summary(lm_robust(y~x*factor(group),data=tb,weights=weights))
#Under IV, additive version produces F for first stage but not second
summary(iv_robust(y~factor(group)+x|z+factor(group),data=tb,weights=weights,se_type='classical',diagnostics=TRUE))
#...but for some reason, if it's only a problem in the first stage, the first-stage F doesn't work
summary(iv_robust(y~x|z+factor(group),data=tb,weights=weights,se_type='classical',diagnostics=TRUE))
#and under IV, interaction breaks both first and second stage
summary(iv_robust(y~x*factor(group)|z*factor(group),data=tb,weights=weights,se_type='classical',diagnostics=TRUE))
#if it's only a problem in the first stage, second stage is fine
summary(iv_robust(y~x|z*factor(group),data=tb,weights=weights,se_type='classical',diagnostics=TRUE))
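For reference, here is a minimal base-R sketch of the behavior I'd expect: compute the F-statistic as a joint Wald test over only the estimable (non-NA) coefficients. It uses only lm, coef, and vcov, not estimatr, and it is not a claim about how estimatr computes its F-statistic internally. The names d, wt, f_manual, and f_lm are made up for illustration, and vcov(..., complete = FALSE) assumes R >= 3.5.0.

```r
# Simulate data where zero weights remove one group entirely,
# which makes some dummy/interaction columns collinear.
set.seed(1)
n <- 300
d <- data.frame(
  x = rnorm(n),
  group = rep(c("A", "B", "C"), n / 3),
  wt = rep(c(0, 1, 2), n / 3)
)
d$y <- 3 * d$x + rnorm(n)

fit <- lm(y ~ x * factor(group), data = d, weights = wt)

# Keep only the estimable (non-NA) coefficients.
b <- coef(fit)
b <- b[!is.na(b)]

# complete = FALSE drops the rows/columns of aliased coefficients.
V <- vcov(fit, complete = FALSE)

# Joint Wald test of every estimable coefficient except the intercept.
nm <- setdiff(names(b), "(Intercept)")
q <- length(nm)
f_manual <- sum(b[nm] * solve(V[nm, nm], b[nm])) / q

# Compare with the F-statistic that summary.lm reports.
f_lm <- unname(summary(fit)$fstatistic["value"])
all.equal(f_manual, f_lm)
```

With classical standard errors the two numbers should agree, since dropping the aliased columns leaves the fitted values unchanged. The same subsetting idea would presumably carry over to a robust variance matrix, but I have not checked that against estimatr's output.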
Thanks, the f-test code isn't my finest hour so this is a good reason to revisit it.
Checked in estimatr 0.20.0, bug is still there (as I'd expect given the issue is open, but figured I'd check).
Sorry Nick, the last patch was for some major problems facing all users. We're triaging now.
Oh no problem. Just thought I'd check since the new version was out.