WildBootTests.jl icon indicating copy to clipboard operation
WildBootTests.jl copied to clipboard

Hypotheses with q > 2

Open s3alfisc opened this issue 3 years ago • 7 comments

Hi David - quick feedback on hypotheses with q > 1: Joint tests of q = 1 or q = 2 hypotheses work fine (and t-stats/F-stats are identical to another R implementation, clubSandwich), but for more than three hypotheses, I receive an error message:

Update: all examples below are produced with the current release version of WildBootTests.jl.

Part 1: test hypotheses with q = 1 and q = 2



library(JuliaConnectoR)
library(clubSandwich)

startJuliaServer()

WildBootTests <- juliaImport("WildBootTests")
df <- read.csv(file = 'https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/sandwich/PetersenCL.csv')
R <- matrix(c(0,1), nrow=1); r <- 0

test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm)
test
WildBootTests$teststat(test) # 20.45298
WildBootTests$p(test) # 0
WildBootTests$CI(test) # 0.9321786 1.137515

lm_fit <- lm(y ~ x, data = df)
R <- clubSandwich::constrain_zero(constraints = 2, coefs = coef(lm_fit)) # [0 1]
clubSandwich::coef_test(obj = lm_fit, vcov= clubSandwich::vcovCR(obj = lm_fit,
                                                                 cluster = df$firm,
                                                                 type = "CR1S"))
# Coef. Estimate     SE t-stat d.f. p-val (Satt) Sig.
# 1 (Intercept)   0.0297 0.0670  0.443  499        0.658     
# 2           x   1.0348 0.0506 20.453  310       <0.001  ***


# now wald_test
R <- clubSandwich::constrain_zero(constraints = 1:2, coefs = coef(lm_fit)) 
# [,1] [,2]
# [1,]    1    0
# [2,]    0    1
r <- c(0, 0)

clubSandwich::Wald_test(obj = lm_fit, 
                        constraints = R, 
                        vcov= clubSandwich::vcovCR(obj = lm_fit,
                                                   cluster = df$firm,
                                                   type = "CR1S"))
# test Fstat df_num df_denom  p_val sig
# HTZ   209      2      359 <0.001 ***
  
test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm)
WildBootTests$teststat(test) # [1] 209.5094
WildBootTests$CI(test) # <0 x 0 matrix>

q = 3


lm_fit <- lm(y ~ x + year, data = df)
R <- clubSandwich::constrain_zero(constraints = 1:3, coefs = coef(lm_fit)) 
# [,1] [,2] [,3]
# [1,]    1    0    0
# [2,]    0    1    0
# [3,]    0    0    1
r <- rep(0, 3)

test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x, df$year), clustid=df$firm)

#' Error: Evaluation in Julia failed.
#' Original Julia error message:
#'   TaskFailedException
#' Stacktrace:
#'   [1] wait
#' @ .\task.jl:334 [inlined]
#' [2] threading_run(func::Function)
#' @ Base.Threads .\threadingconstructs.jl:38
#' [3] macro expansion
#' @ .\threadingconstructs.jl:97 [inlined]
#' [4] MakeNonWRELoop1!
#'   @ C:\Users\alexa\.julia\packages\WildBootTests\HiwPV\src\nonWRE.jl:224 [inlined]
#' [5] MakeNonWREStats!(o::WildBootTests.StrBootTest{Float32}, w::Int64)
#' @ WildBootTests C:\Users\alexa\.julia\packages\WildBootTests\HiwPV\src\nonWRE.jl:263
#' [6] boottestOLSARubin!(o::WildBootTests.StrBootTest{Float32})
#' @ WildBootTests C:\Users\alexa\.julia\packages\WildBootTests\HiwPV\src\WildBootTests.jl:35
#' [7] getp(o::WildBootTests.StrBootTest{Float32})
#' @ WildBootTests C:\Users\alexa\.julia\packages\WildBootTests\HiwPV\src\StrBootTest.jl:297
#' [8] __wildboottest(R::Matrix{Float32}, r::Vector{Float32}; resp::Vector{Float32}, predexog::Matrix{Float32}, predendog::Matrix{Float32}, i
                   


s3alfisc avatar Jan 15 '22 12:01 s3alfisc

Something I forgot to ask in the post above - for q = 2, does WildBootTests.boottest() compute confidence sets as Stata.boottest? In the q = 2 example above, WildBootTests$CI(test) returns a '0x0' Julia matrix.

s3alfisc avatar Jan 15 '22 13:01 s3alfisc

Excellent. I believe I have fixed the bug. I'm in the middle of some complicated rewriting of the WRE in the main branch, so it would be easier not to put out a new release yet. But let me know if the wait becomes a problem.

The Stata boottest does not produce CIs for q=2. It does offer to plot a confidence surface however, and wildboottests() does the same. See the confidence surface example at https://droodman.github.io/WildBootTests.jl/dev/OLSexamples/#Further-examples.

droodman avatar Jan 15 '22 14:01 droodman

I did not really think my question through before asking - it makes good sense that there are no CIs for q = 2. Regarding the bug fix, I can develop wildboottestjlr with q = 2 and test for q > 2 once you release the new version.

s3alfisc avatar Jan 15 '22 16:01 s3alfisc

I'm sorry to report that I believe that hypotheses with q = 2 don't match clubSandwich. For q = 1, the match is exact:

q = 1

library(JuliaConnectoR)
library(clubSandwich)

startJuliaServer()

WildBootTests <- juliaImport("WildBootTests")
df <- read.csv(file = 'https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/sandwich/PetersenCL.csv')
lm_fit <- lm(y ~ x, data = df)


# q = 1
R <- clubSandwich::constrain_zero(constraints = 2, coefs = coef(lm_fit)) 
R
r <- c(0)

clubSw <- 
  clubSandwich::Wald_test(obj = lm_fit, 
                          constraints = R, 
                          vcov= clubSandwich::vcovCR(obj = lm_fit,
                                                     cluster = df$firm,
                                                     type = "CR0"))
teststat_clubSw <- clubSw$Fstat

test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm, small = F)
teststat_wild <- WildBootTests$teststat(test) 

sqrt(teststat_clubSw) # 20.47551
teststat_wild # 20.47551

q >= 2

No small sample adjustments:

R <- clubSandwich::constrain_zero(constraints = 1:2, coefs = coef(lm_fit)) 
R
r <- c(0, 0)

clubSw <- 
clubSandwich::Wald_test(obj = lm_fit, 
                        constraints = R, 
                        vcov= clubSandwich::vcovCR(obj = lm_fit,
                                                   cluster = df$firm,
                                                   type = "CR0"))
teststat_clubSw <- clubSw$Fstat

test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm, small = F)
teststat_wild <- WildBootTests$teststat(test) 

teststat_clubSw # 209.389
teststat_wild # 419.9426

With small sample adjustments, the test statistics get closer but don't match:

# standard correction (N-1) 7 (N-k) x G / (G - 1)

clubSw <- 
  clubSandwich::Wald_test(obj = lm_fit, 
                          constraints = R, 
                          vcov= clubSandwich::vcovCR(obj = lm_fit,
                                                     cluster = df$firm,
                                                     type = "CR1S"))
teststat_clubSw <- clubSw$Fstat

test <- WildBootTests$wildboottest(R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm)
teststat_wild <- WildBootTests$teststat(test) 

teststat_clubSw # 208.9284
teststat_wild # 209.5094

Above, both statistics are rounded to 209.

s3alfisc avatar Jan 16 '22 18:01 s3alfisc

Alexander, I think most of what is going on here is that without the small-sample adjustment, wildboottest() is reporting a chi2 instead of an F statistic. The F statistic is divided by the degree of the test (here, 2), but the chi2 is not. You can get the type of statistic with WildBootTests$stattype(test). So this looks OK to me.

droodman avatar Jan 16 '22 19:01 droodman

I believe this can be closed - the very small discrepancies in F-stats between WildBootTests.jl & clubSandwich was due to a function argument in clubSandwich::Wald_test() that I missed (sorry for that!):

library(JuliaConnectoR)
library(clubSandwich)

startJuliaServer()

WildBootTests <- juliaImport("WildBootTests")
df <- read.csv(file = 'https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/sandwich/PetersenCL.csv')
lm_fit <- lm(y ~ x, data = df)

R <- clubSandwich::constrain_zero(constraints = 1:2, coefs = coef(lm_fit)) 
R
r <- c(0, 0)

clubSw <- 
  clubSandwich::Wald_test(obj = lm_fit, 
                          constraints = R, 
                          vcov= clubSandwich::vcovCR(obj = lm_fit,
                                                     cluster = df$firm,
                                                     type = "CR1S"), 
                          test = "Naive-F")
teststat_clubSw <- clubSw$Fstat

test <- WildBootTests$wildboottest("Float64", R, r, resp=df$y, predexog=cbind(1, df$x), clustid=df$firm)
teststat_wild <- WildBootTests$teststat(test) 

teststat_clubSw # 209.5096
teststat_wild # 209.5096

s3alfisc avatar Feb 14 '22 21:02 s3alfisc

Justi FYI, you can find my tests for 'deterministic' equality of Wald Statistics computed via WildBootTests.jl against fixest::wald() here : https://github.com/s3alfisc/fwildclusterboot/blob/develop/tests/testthat/test-tstat_equivalence.R#L176 - they all pass (under v.0.7.6) :)

s3alfisc avatar Mar 13 '22 10:03 s3alfisc