WildBootTests.jl
WildBootTests.jl copied to clipboard
Hypotheses with q > 2
Hi David - quick feedback on hypotheses with q > 1: Joint tests of q = 1 or q = 2 hypotheses work fine (and t-stats/F-stats are identical to another R implementation, clubSandwich), but for more than three hypotheses, I receive an error message:
Update: all examples below are produced with the current release version of WildBootTests.jl.
Part 1: test hypotheses with q = 1 and q = 2
library(JuliaConnectoR)
library(clubSandwich)
startJuliaServer()
WildBootTests <- juliaImport("WildBootTests")
df <- read.csv(file = 'https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/sandwich/PetersenCL.csv')
R <- matrix(c(0,1), nrow=1); r <- 0
test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm)
test
WildBootTests$teststat(test) # 20.45298
WildBootTests$p(test) # 0
WildBootTests$CI(test) # 0.9321786 1.137515
lm_fit <- lm(y ~ x, data = df)
R <- clubSandwich::constrain_zero(constraints = 2, coefs = coef(lm_fit)) # [0 1]
clubSandwich::coef_test(obj = lm_fit, vcov= clubSandwich::vcovCR(obj = lm_fit,
cluster = df$firm,
type = "CR1S"))
# Coef. Estimate SE t-stat d.f. p-val (Satt) Sig.
# 1 (Intercept) 0.0297 0.0670 0.443 499 0.658
# 2 x 1.0348 0.0506 20.453 310 <0.001 ***
# now wald_test
R <- clubSandwich::constrain_zero(constraints = 1:2, coefs = coef(lm_fit))
# [,1] [,2]
# [1,] 1 0
# [2,] 0 1
r <- c(0, 0)
clubSandwich::Wald_test(obj = lm_fit,
constraints = R,
vcov= clubSandwich::vcovCR(obj = lm_fit,
cluster = df$firm,
type = "CR1S"))
# test Fstat df_num df_denom p_val sig
# HTZ 209 2 359 <0.001 ***
test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm)
WildBootTests$teststat(test) # [1] 209.5094
WildBootTests$CI(test) # <0 x 0 matrix>
q = 3
lm_fit <- lm(y ~ x + year, data = df)
R <- clubSandwich::constrain_zero(constraints = 1:3, coefs = coef(lm_fit))
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 0 1 0
# [3,] 0 0 1
r <- rep(0, 3)
test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x, df$year), clustid=df$firm)
#' Error: Evaluation in Julia failed.
#' Original Julia error message:
#' TaskFailedException
#' Stacktrace:
#' [1] wait
#' @ .\task.jl:334 [inlined]
#' [2] threading_run(func::Function)
#' @ Base.Threads .\threadingconstructs.jl:38
#' [3] macro expansion
#' @ .\threadingconstructs.jl:97 [inlined]
#' [4] MakeNonWRELoop1!
#' @ C:\Users\alexa\.julia\packages\WildBootTests\HiwPV\src\nonWRE.jl:224 [inlined]
#' [5] MakeNonWREStats!(o::WildBootTests.StrBootTest{Float32}, w::Int64)
#' @ WildBootTests C:\Users\alexa\.julia\packages\WildBootTests\HiwPV\src\nonWRE.jl:263
#' [6] boottestOLSARubin!(o::WildBootTests.StrBootTest{Float32})
#' @ WildBootTests C:\Users\alexa\.julia\packages\WildBootTests\HiwPV\src\WildBootTests.jl:35
#' [7] getp(o::WildBootTests.StrBootTest{Float32})
#' @ WildBootTests C:\Users\alexa\.julia\packages\WildBootTests\HiwPV\src\StrBootTest.jl:297
#' [8] __wildboottest(R::Matrix{Float32}, r::Vector{Float32}; resp::Vector{Float32}, predexog::Matrix{Float32}, predendog::Matrix{Float32}, i
Something I forgot to ask in the post above - for q = 2, does WildBootTests.boottest() compute confidence sets as Stata.boottest? In the q = 2 example above, WildBootTests$CI(test) returns a '0x0' Julia matrix.
Excellent. I believe I have fixed the bug. I'm in the middle of some complicated rewriting of the WRE in the main branch, so it would be easier not to put out a new release yet. But let me know if the wait becomes a problem.
The Stata boottest does not produce CIs for q=2. It does offer to plot a confidence surface however, and wildboottests() does the same. See the confidence surface example at https://droodman.github.io/WildBootTests.jl/dev/OLSexamples/#Further-examples.
I did not really think my question through before asking - it makes good sense that there are no CIs for q = 2. Regarding the bug fix, I can develop wildboottestjlr with q = 2 and test for q > 2 once you release the new version.
I'm sorry to report that I believe that hypotheses with q = 2 don't match clubSandwich. For q = 1, the match is exact:
q = 1
library(JuliaConnectoR)
library(clubSandwich)
startJuliaServer()
WildBootTests <- juliaImport("WildBootTests")
df <- read.csv(file = 'https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/sandwich/PetersenCL.csv')
lm_fit <- lm(y ~ x, data = df)
# q = 1
R <- clubSandwich::constrain_zero(constraints = 2, coefs = coef(lm_fit))
R
r <- c(0)
clubSw <-
clubSandwich::Wald_test(obj = lm_fit,
constraints = R,
vcov= clubSandwich::vcovCR(obj = lm_fit,
cluster = df$firm,
type = "CR0"))
teststat_clubSw <- clubSw$Fstat
test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm, small = F)
teststat_wild <- WildBootTests$teststat(test)
sqrt(teststat_clubSw) # 20.47551
teststat_wild # 20.47551
q >= 2
No small sample adjustments:
R <- clubSandwich::constrain_zero(constraints = 1:2, coefs = coef(lm_fit))
R
r <- c(0, 0)
clubSw <-
clubSandwich::Wald_test(obj = lm_fit,
constraints = R,
vcov= clubSandwich::vcovCR(obj = lm_fit,
cluster = df$firm,
type = "CR0"))
teststat_clubSw <- clubSw$Fstat
test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm, small = F)
teststat_wild <- WildBootTests$teststat(test)
teststat_clubSw # 209.389
teststat_wild # 419.9426
With small sample adjustments, the test statistics get closer but don't match:
# standard correction (N-1) 7 (N-k) x G / (G - 1)
clubSw <-
clubSandwich::Wald_test(obj = lm_fit,
constraints = R,
vcov= clubSandwich::vcovCR(obj = lm_fit,
cluster = df$firm,
type = "CR1S"))
teststat_clubSw <- clubSw$Fstat
test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm)
teststat_wild <- WildBootTests$teststat(test)
teststat_clubSw # 208.9284
teststat_wild # 209.5094
Above, both statistics are rounded to 209.
Alexander, I think most of what is going on here is that without the small-sample adjustment, wildboottest() is reporting a chi2 instead of an F statistic. The F statistic is divided by the degree of the test (here, 2), but the chi2 is not. You can get the type of statistic with WildBootTests$stattype(test). So this looks OK to me.
I believe this can be closed - the very small discrepancies in F-stats between WildBootTests.jl & clubSandwich was due to a function argument in clubSandwich::Wald_test() that I missed (sorry for that!):
library(JuliaConnectoR)
library(clubSandwich)
startJuliaServer()
WildBootTests <- juliaImport("WildBootTests")
df <- read.csv(file = 'https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/sandwich/PetersenCL.csv')
lm_fit <- lm(y ~ x, data = df)
R <- clubSandwich::constrain_zero(constraints = 1:2, coefs = coef(lm_fit))
R
r <- c(0, 0)
clubSw <-
clubSandwich::Wald_test(obj = lm_fit,
constraints = R,
vcov= clubSandwich::vcovCR(obj = lm_fit,
cluster = df$firm,
type = "CR1S"),
test = "Naive-F")
teststat_clubSw <- clubSw$Fstat
test <- WildBootTests$wildboottest("Float64", R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm)
teststat_wild <- WildBootTests$teststat(test)
teststat_clubSw # 209.5096
teststat_wild # 209.5096
Justi FYI, you can find my tests for 'deterministic' equality of Wald Statistics computed via WildBootTests.jl against fixest::wald() here : https://github.com/s3alfisc/fwildclusterboot/blob/develop/tests/testthat/test-tstat_equivalence.R#L176 - they all pass (under v.0.7.6) :)