estimatr
Return first-stage in iv_robust
It's common to see the coefficients from the first-stage regression in a 2SLS regression table. For example, see discussion here. estimatr::iv_robust does not currently support this AFAIK. (Although it does return some overall diagnostic results from the first stage if the "diagnostics = T" argument is used.) Would it be possible to add the first stage to the model return object?
FWIW, lfe::felm supports this with a "stage1" return object. Here's a reprex:
# library(AER) ## only for CigarettesSW dataset
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(lfe))
## Get the data
data("CigarettesSW", package = "AER")
## Create a new data frame with some modified variables
cigs <-
  CigarettesSW %>%
  mutate(
    rprice = price/cpi,
    rincome = income/population/cpi,
    rtax = tax/cpi,
    tdiff = (taxs - tax)/cpi
  )
## Run the IV regression in felm with tdiff and rtax instrumenting the endogenous
## variable log(rprice)
iv_felm <-
  felm(
    log(packs) ~ log(rincome) |
      year + state | ## FEs
      (log(rprice) ~ tdiff + rtax), ## Endog. variable and instruments
    data = cigs
  )
## Show the first-stage result
summary(iv_felm$stage1)
#>
#> Call:
#> NULL
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -0.06233 -0.01529  0.00000  0.01529  0.06233
#>
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)
#> log(rincome) -0.028994   0.147492  -0.197    0.845
#> tdiff         0.013457   0.003050   4.412 6.52e-05 ***
#> rtax          0.007573   0.001049   7.221 5.43e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.03064 on 44 degrees of freedom
#> Multiple R-squared(full model): 0.9815 Adjusted R-squared: 0.9601
#> Multiple R-squared(proj model): 0.7779 Adjusted R-squared: 0.5204
#> F-statistic(full model):45.85 on 51 and 44 DF, p-value: < 2.2e-16
#> F-statistic(proj model): 51.36 on 3 and 44 DF, p-value: 2.015e-14
#> F-statistic(excl instr.):75.65 on 2 and 44 DF, p-value: 5.758e-15
Created on 2019-11-07 by the reprex package (v0.3.0)
Thanks very much for this. I can see a nice argument for returning the reduced-form and first-stage regressions as additional entries in the iv_robust object. It might be cool if we could then have
tidy(iv_robust_fit, model = "first_stage")
tidy(iv_robust_fit, model = "reduced_form")
tidy(iv_robust_fit, model = "second_stage")
or similar?
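For anyone who needs the three sets of estimates in the meantime, here is a minimal sketch of fitting them separately with functions estimatr already exports. Everything about the selector is an assumption: tidy_stage() is a hypothetical stand-in for the proposed tidy(fit, model = ...) interface and is not part of estimatr. The fixed effects enter as plain dummies (year and state are factors in CigarettesSW), and se_type = "classical" is chosen only so the numbers are comparable with the default felm output above.

```r
library(estimatr)

## Same data preparation as the reprex above, using base R only
data("CigarettesSW", package = "AER")
cigs <- transform(
  CigarettesSW,
  rprice  = price / cpi,
  rincome = income / population / cpi,
  rtax    = tax / cpi,
  tdiff   = (taxs - tax) / cpi
)

## Fit each regression on its own; year and state enter as dummy variables
fits <- list(
  ## First stage: endogenous variable on instruments + exogenous controls
  first_stage = lm_robust(
    log(rprice) ~ tdiff + rtax + log(rincome) + year + state,
    data = cigs, se_type = "classical"
  ),
  ## Reduced form: outcome on instruments + exogenous controls
  reduced_form = lm_robust(
    log(packs) ~ tdiff + rtax + log(rincome) + year + state,
    data = cigs, se_type = "classical"
  ),
  ## Second stage: the usual 2SLS fit
  second_stage = iv_robust(
    log(packs) ~ log(rprice) + log(rincome) + year + state |
      tdiff + rtax + log(rincome) + year + state,
    data = cigs, se_type = "classical"
  )
)

## Hypothetical selector, NOT an estimatr function: mimics the proposed
## tidy(fit, model = "...") by picking one of the separately fitted models
tidy_stage <- function(fits, model = c("second_stage", "first_stage", "reduced_form")) {
  model <- match.arg(model)
  tidy(fits[[model]])
}

stage1 <- tidy_stage(fits, model = "first_stage")
stage1[stage1$term %in% c("tdiff", "rtax", "log(rincome)"), ]
```

The point estimates should coincide with the felm ones, since projecting out the fixed effects and including them as dummies give the same OLS and 2SLS coefficients. The standard errors depend on the se_type chosen.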
@acoppock That looks great to me.
@acoppock I would recommend just adding the first stage as a named element on the iv_robust_fit object, rather than directing people to specify non-standard options on tidy, which would also leave summary unsupported.
Here is what lfe does for tidy:
> tidy(iv_felm)
# A tibble: 2 x 5
  term               estimate std.error statistic       p.value
  <chr>                 <dbl>     <dbl>     <dbl>         <dbl>
1 log(rincome)          0.462     0.308      1.50 0.141
2 `log(rprice)(fit)`   -1.20      0.171     -7.02 0.00000000940
> tidy(iv_felm$stage1)
# A tibble: 3 x 5
  term         estimate std.error statistic       p.value
  <chr>           <dbl>     <dbl>     <dbl>         <dbl>
1 log(rincome)  -0.0290   0.147      -0.197 0.845
2 tdiff          0.0135   0.00305     4.41  0.0000652
3 rtax           0.00757  0.00105     7.22  0.00000000543
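A rough sketch of what the named-element approach could look like from the user's side with estimatr as it stands today. The wrapper iv_with_stage1() and the stage1 element name are hypothetical (borrowed from the lfe convention) and are not part of estimatr. The first-stage formula is supplied by hand, the fixed effects are entered as dummies, and se_type = "classical" is only a placeholder choice.

```r
library(estimatr)

## Same data preparation as the reprex above, using base R only
data("CigarettesSW", package = "AER")
cigs <- transform(
  CigarettesSW,
  rprice  = price / cpi,
  rincome = income / population / cpi,
  rtax    = tax / cpi,
  tdiff   = (taxs - tax) / cpi
)

## Hypothetical wrapper, NOT an estimatr function: fits the 2SLS model and
## attaches a separately fitted first stage as a named list element
iv_with_stage1 <- function(iv_formula, stage1_formula, data, se_type = "classical") {
  fit <- iv_robust(iv_formula, data = data, se_type = se_type)
  fit$stage1 <- lm_robust(stage1_formula, data = data, se_type = se_type)
  fit
}

iv_fit <- iv_with_stage1(
  iv_formula = log(packs) ~ log(rprice) + log(rincome) + year + state |
    tdiff + rtax + log(rincome) + year + state,
  stage1_formula = log(rprice) ~ tdiff + rtax + log(rincome) + year + state,
  data = cigs
)

## The standard generics then work on either object, with no extra arguments
second_tbl <- tidy(iv_fit)
first_tbl  <- tidy(iv_fit$stage1)
summary(iv_fit$stage1)
```

The appeal of this design is that tidy() and summary() keep their usual signatures, and the first stage is just another model object.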
Thanks Neal, that's really helpful. Nice way to do both.
Can I upvote this issue? :-) Thanks
Any news on this?