fixest
fixest copied to clipboard
Feature request: VCV for multiple outcomes
Laurent -- thanks again for the fantastic package!
I think it would be great if fixest could return the joint variance-covariance matrix when estimating multiple regressions at once. I.e., I run
feols(c(y1,y2) ~ x,...)
Now I'd like to know the covariance between the coefficient on x when using y1 as an outcome and y2 as an outcome. (This is what's sometime called seeming unrelated regression.) There are some wrap-arounds available (e.g. here), but it would be great if this could easily be done natively.
@jonathandroth, here's a basic version (no clustering). Two questions and I can make this more robust:
- What is the correct way to account for degrees of freedom? Is it sum of number of rows in each est - sum of number of covariates in each est?
- Would clustered standard errors be useful? I believe you would do sum $X_i' e_i$ for each g and premultiply by $(X'X)^{-1}$ to get a $N_g$ by $k$ matrix of influence functions (instead of N by k).
library(fixest)
#' Get joint variance-covariance matrix of seemingly unrelated regressions
#'
#' @param ests List of `feols` estimates.
#'
#' @return Variance-covariance matrix of the combined set of coefficients in
#' all models in `ests`.
#'
vcovSUR <- function(ests) {
if (inherits(ests, "fixest") | inherits(ests, "fixest_multi")) {
ests = list(ests)
}
inf_list <- lapply(ests, get_influence_func)
inf_funcs <- do.call("rbind", inf_list)
vcov = tcrossprod(inf_funcs)
return(vcov)
}
get_influence_func <- function(est) {
# (X'X)^{-1} X'e
if (inherits(est, "fixest_multi")) {
inf_list = lapply(est, function(x) {
(x$cov.iid / x$sigma2) %*% t(x$scores)
})
do.call("rbind", inf_list)
} else if (inherits(est, "fixest")) {
(est$cov.iid / est$sigma2) %*% t(est$scores)
} else { # code taken partially from `sandwich`
X <- stats::model.matrix(est)
if (any(alias <- is.na(stats::coef(est)))) X <- X[, !alias, drop = FALSE]
if (!is.null(w <- stats::weights(est))) X <- X * sqrt(w)
scores = X * stats::resid(est)
solve(crossprod(X), t(scores))
}
}
mtcars$w = runif(nrow(mtcars), 0.15, 0.85)
est1 <- feols(mpg ~ wt, mtcars)
est2 <- feols(mpg ~ wt + i(cyl), mtcars)
est3 <- lm(mpg ~ wt + i(cyl), mtcars)
est_w <- feols(mpg ~ wt + i(cyl), mtcars, weights = ~w)
est_multi <- feols(c(mpg, hp) ~ wt + i(cyl), mtcars)
vcov(est1, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE))
#> (Intercept) wt
#> (Intercept) 4.516833 -1.305717
#> wt -1.305717 0.401577
vcov(est2, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE))
#> (Intercept) wt cyl::6 cyl::8
#> (Intercept) 3.2700821 -0.9321293 -0.3715946 0.3477177
#> wt -0.9321293 0.3783737 -0.2442125 -0.5329499
#> cyl::6 -0.3715946 -0.2442125 1.2716154 1.3082462
#> cyl::8 0.3477177 -0.5329499 1.3082462 1.9913114
vcovSUR(list(est1, est2))
#> (Intercept) wt (Intercept) wt cyl::6
#> (Intercept) 4.5168333 -1.30571703 3.3911740 -0.8658345 -0.6364905
#> wt -1.3057170 0.40157696 -0.9446715 0.2607989 0.1249679
#> (Intercept) 3.3911740 -0.94467154 3.2700821 -0.9321293 -0.3715946
#> wt -0.8658345 0.26079887 -0.9321293 0.3783737 -0.2442125
#> cyl::6 -0.6364905 0.12496790 -0.3715946 -0.2442125 1.2716154
#> cyl::8 -0.2795933 0.06152524 0.3477177 -0.5329499 1.3082462
#> cyl::8
#> (Intercept) -0.27959331
#> wt 0.06152524
#> (Intercept) 0.34771768
#> wt -0.53294991
#> cyl::6 1.30824624
#> cyl::8 1.99131136
# Works with weights
vcov(est_w, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE))
#> (Intercept) wt cyl::6 cyl::8
#> (Intercept) 5.0756301 -1.12152509 -1.59657428 -0.4062194
#> wt -1.1215251 0.33655783 0.07906278 -0.2880801
#> cyl::6 -1.5965743 0.07906278 1.47597744 1.2937823
#> cyl::8 -0.4062194 -0.28808008 1.29378231 1.9053256
vcovSUR(est_w)
#> (Intercept) wt cyl::6 cyl::8
#> (Intercept) 5.0756301 -1.12152509 -1.59657428 -0.4062194
#> wt -1.1215251 0.33655783 0.07906278 -0.2880801
#> cyl::6 -1.5965743 0.07906278 1.47597744 1.2937823
#> cyl::8 -0.4062194 -0.28808008 1.29378231 1.9053256
# Works with multi
lapply(est_multi, function(est) vcov(est, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE)))
#> $`lhs: mpg`
#> (Intercept) wt cyl::6 cyl::8
#> (Intercept) 3.2700821 -0.9321293 -0.3715946 0.3477177
#> wt -0.9321293 0.3783737 -0.2442125 -0.5329499
#> cyl::6 -0.3715946 -0.2442125 1.2716154 1.3082462
#> cyl::8 0.3477177 -0.5329499 1.3082462 1.9913114
#>
#> $`lhs: hp`
#> (Intercept) wt cyl::6 cyl::8
#> (Intercept) 481.0195 -189.14317 135.81828 479.2033
#> wt -189.1432 80.30526 -73.10224 -221.1762
#> cyl::6 135.8183 -73.10224 174.73028 230.6635
#> cyl::8 479.2033 -221.17619 230.66346 730.4108
vcovSUR(est_multi)
#> (Intercept) wt cyl::6 cyl::8 (Intercept)
#> (Intercept) 3.2700821 -0.9321293 -0.3715946 0.3477177 -3.6102659
#> wt -0.9321293 0.3783737 -0.2442125 -0.5329499 0.2846401
#> cyl::6 -0.3715946 -0.2442125 1.2716154 1.3082462 2.1105066
#> cyl::8 0.3477177 -0.5329499 1.3082462 1.9913114 -1.6623714
#> (Intercept) -3.6102659 0.2846401 2.1105066 -1.6623714 481.0194872
#> wt 0.2846401 -0.2300793 0.7005150 2.4442424 -189.1431698
#> cyl::6 2.1105066 0.7005150 -6.1116792 -6.4158361 135.8182788
#> cyl::8 -1.6623714 2.4442424 -6.4158361 -13.5626121 479.2033017
#> wt cyl::6 cyl::8
#> (Intercept) 0.2846401 2.110507 -1.662371
#> wt -0.2300793 0.700515 2.444242
#> cyl::6 0.7005150 -6.111679 -6.415836
#> cyl::8 2.4442424 -6.415836 -13.562612
#> (Intercept) -189.1431698 135.818279 479.203302
#> wt 80.3052563 -73.102243 -221.176193
#> cyl::6 -73.1022434 174.730281 230.663464
#> cyl::8 -221.1761929 230.663464 730.410817
Created on 2023-07-10 with reprex v2.0.2
@grantmcdermott @vincentarelbundock, is there a way to call marginaleffects::hypothesis
providing a vector of coefficients and the variance-covariance matrix?
@kylebutts maybe the FUN argument in hypotheses() can help. (Not sure.)
Thanks, @kylebutts !
Re d.o.f., Stata's sureg command has the following options:
Yes, I do think clustered SEs would be a nice add-on.
@vincentarelbundock That worked with defining custom get_coef
and set_coef
methods! Thanks!
Cool! Is that too niche, or is it something that should be merged in the package, you think?
@jonathandroth See https://github.com/kylebutts/vcovSUR :-)
✔️ Variance-covariance matrix estimation
✔️ Clustering (which I realized is necessary for clustering by observation id)
✔️ Support for fixest_multi
objects
✔️ Hypothesis Tests from {marginaleffects}
✖︎ No small-sample degree of freedom correction (yet).
Awesome! My own 2 cents is it would be useful to have a native fixest feature for doing hypothesis tests when using multiple outcomes, but I'll leave it up to the package development experts here. Either way this will be a great resource
On Wed, Jul 12, 2023 at 11:18 AM Kyle F Butts @.***> wrote:
@jonathandroth https://github.com/jonathandroth See https://github.com/kylebutts/vcovSUR :-)
✔️ Variance-covariance matrix estimation ✔️ Clustering (which I realized is necessary for clustering by observation id) ✔️ Support for fixest_multi objects ✔️ Hypothesis Tests from {marginaleffects} ✖︎ No small-sample degree of freedom correction (yet).
— Reply to this email directly, view it on GitHub https://github.com/lrberge/fixest/issues/424#issuecomment-1632736446, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE6EXFCGYE76O27U4ZRVGWDXP252VANCNFSM6AAAAAAZR2EXXE . You are receiving this because you were mentioned.Message ID: @.***>