MixedModels.jl

Long-term projects

Open dmbates opened this issue 5 years ago • 3 comments

  • [x] Evaluate the trace of the "hat" matrix. I think the best (universal) value of dof_residual may be nobs - tr(H). Alternatively, different terms may need different denominator degrees of freedom, according to whether they are between or within grouping factors. Or we could just stay with the current approach: forget denominator degrees of freedom and assume they are always infinite.

  • [x] Save the lower triangle of the feL matrix in each row of the bstr from parametricbootstrap so that the Wald test results can be derived from that object alone. Naturally this will work fine for full-rank models but will need some ugly workarounds for rank-deficient models. Grrr. In retrospect I think it will be better to just save both the estimated coefficients and their standard errors (full-length version in rank-deficient cases).

  • [ ] Drop the weighted model matrices from FeMat and ReMat and create an evaluation chain for passing a copy of .allterms through an update phase when performing weighting or pre-whitening.

  • [ ] If we use the Distributions package only to evaluate some tail probabilities in the coeftable and in likelihoodratiotest, we could use SpecialFunctions instead, which is lightweight compared to Distributions. The p-value for a two-sided Z test is erfc(abs(z)/sqrt(2)).
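As a quick check of the erfc identity in the last item, here is a minimal sketch. It is written in Python with only the standard library, purely for illustration; it is not code from MixedModels.jl, and the function names `two_sided_z_pvalue` and `normal_cdf` are made up for this example. It compares the erfc form against the textbook definition 2 * (1 - Φ(|z|)).

```python
import math


def two_sided_z_pvalue(z: float) -> float:
    """Two-sided p-value for a standard-normal statistic z, via erfc."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def normal_cdf(x: float) -> float:
    """Standard normal CDF, also written in terms of erfc (illustration only)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


if __name__ == "__main__":
    for z in (0.0, 1.0, 1.96, -2.5):
        p_erfc = two_sided_z_pvalue(z)
        p_def = 2.0 * (1.0 - normal_cdf(abs(z)))  # definition of the two-sided tail
        print(f"z = {z:5.2f}  erfc form = {p_erfc:.6f}  definition = {p_def:.6f}")
```

The erfc form avoids the subtraction 1 - Φ(|z|), so it keeps its precision for large |z|, where the definition-based form loses digits to cancellation.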

dmbates avatar Feb 25 '20 07:02 dmbates

The standard errors of the fixed-effects coefficient estimates, a version of the second item, are now saved in the leanbootstrap branch.

dmbates avatar Mar 12 '20 17:03 dmbates

The leverage method and saving the standard errors in the MixedModelBootstrap object (items 1 and 2) are now part of the master branch.

If we use the leverage results extensively, the leverage method should be tuned up considerably. Right now the approach works but is naive: the ith leverage value is evaluated by replacing the response vector with the ith basis vector, then updating A and doing a full updateL!. Only the last row of L changes, hence only that row needs to be updated. Also, the fact that L[Block(1,1)] retains the sparsity pattern of A[Block(1,1)] means that L[Block(k,1)] (k is the block index of the last block) will also be sparse. The nonzeros in that block are determined from m.allterms[1].refs[i] and the corresponding column of m.allterms[1].z. If m.allterms[1] is a scalar random-effects term, then all the updating, etc. of L[Block(k, 1)] can be expressed as scalar operations.
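To make the naive approach concrete, here is a small sketch of the same idea for an ordinary least-squares line fit rather than a mixed model. It is plain Python for illustration only, not MixedModels.jl code, and `fit_line` and `leverage_naive` are hypothetical names. Each leverage value is obtained by refitting with the ith basis vector as the response and reading off the ith fitted value; summing the values gives tr(H), and nobs - tr(H) is the dof_residual candidate from the first checklist item.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    return a, b


def leverage_naive(x):
    """ith leverage = ith fitted value when the response is the ith basis vector."""
    n = len(x)
    h = []
    for i in range(n):
        e_i = [1.0 if j == i else 0.0 for j in range(n)]  # ith basis vector as response
        a, b = fit_line(x, e_i)                           # one full refit per observation
        h.append(a + b * x[i])
    return h


x = [1.0, 2.0, 3.0, 4.0, 10.0]
h = leverage_naive(x)
trace_h = sum(h)                 # tr(H)
dof_residual = len(x) - trace_h  # nobs - tr(H)

if __name__ == "__main__":
    print("leverage values:", [round(v, 4) for v in h])
    print("tr(H):", round(trace_h, 12), " dof_residual:", round(dof_residual, 12))
```

In this full-rank OLS case tr(H) is just the number of coefficients (2). The sketch also shows why the naive method is expensive: it performs one complete refit per observation, which is the cost that updating only the last row of L is meant to avoid.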

dmbates avatar Mar 22 '20 16:03 dmbates

Mostly a note to myself here, but we still have the indirect dependence on Distributions via GLM. There is also the Chisq CDF in StatsFuns as an alternative for computing p-values.
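For reference, the chi-square upper tail needed by a likelihood ratio test has simple closed forms in a few cases, so a small sketch can show what is being computed. This is illustrative Python using only the standard library, not the StatsFuns or Distributions implementation, and `chisq_upper_tail` is a hypothetical name. It handles only 1 degree of freedom (via erfc) and even degrees of freedom (a finite sum); other values would need a regularized incomplete gamma function from a library.

```python
import math


def chisq_upper_tail(x: float, dof: int) -> float:
    """P(X > x) for X ~ chi-square(dof); closed forms for dof == 1 and even dof only."""
    if x < 0.0:
        raise ValueError("x must be non-negative")
    if dof == 1:
        # A chi-square(1) variate is a squared standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if dof > 0 and dof % 2 == 0:
        # exp(-x/2) * sum_{j=0}^{dof/2 - 1} (x/2)^j / j!
        half = x / 2.0
        term = 1.0
        total = 1.0
        for j in range(1, dof // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("odd dof > 1 needs an incomplete gamma function")


if __name__ == "__main__":
    for stat, dof in ((3.84, 1), (5.99, 2), (9.49, 4)):
        print(f"chisq = {stat}, dof = {dof}, p = {chisq_upper_tail(stat, dof):.4f}")
```

With 1 degree of freedom this coincides with the two-sided Z-test p-value evaluated at z = sqrt(x).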

palday avatar Oct 01 '20 19:10 palday