StatsModels.jl icon indicating copy to clipboard operation
StatsModels.jl copied to clipboard

lrtest incorrect

Open jerlich opened this issue 2 years ago • 3 comments

I initially posted this issue (incorrectly) to GLM.jl https://github.com/JuliaStats/GLM.jl/issues/490

lrtest tests a difference in deviance, but the correct definition for the test statistic is twice the difference in the loglikelihood.

I have edited lrtest to be consistent with lmtest::lrtest in R. But this breaks tests. in lrtest.jl lines 108:119 become

    dev = deviance.(mods)
    Δdev = _diff(dev)

    loglik = loglikelihood.(mods)
    Δloglik = _diff(loglik)

    Δdf = _diff(df)
    dfr = Int.(dof_residual.(mods))

    if (forward && any(x -> x > 0, Δdev)) || (!forward && any(x -> x < 0, Δdev))
        throw(ArgumentError("Residual deviance must be strictly lower " *
                            "in models with more degrees of freedom"))
    end

    pval = (NaN, chisqccdf.(abs.(Δdf), 2 .* abs.(Δloglik))...)
    

I'm curious how the 'correct' pvals for tests were set.

jerlich avatar Jul 31 '22 23:07 jerlich

Tangentially relevant but "residual deviance must be strictly lower in models with more degrees of freedom" seems not to be the case in practice for all types of models, and is (at least partially?) why MixedModels.jl defines its own likelihood ratio test function.

ararslan avatar Aug 01 '22 15:08 ararslan

I'm curious how the 'correct' pvals for tests were set.

The test in StatsModels is really just a basic check that code works. We should put more thorough tests in GLM to cover all model families and cross check against other implementations.

nalimilan avatar Aug 02 '22 13:08 nalimilan

Why MixedModels.jl defines its own LRT:

  1. We needed it while the lrtest PR was stalled and so we made it and released it.
  2. isnested is a challenging question for mixed models. Even when you have the formula specification, it may be non obvious, e.g. y ~ 1 + (1|g) + (1|g&x) is nested within y ~ 1 + (1+x|g) (the former is equivalent to the latter with a compound symmetry restriction on the random effects covariance matrix). There are also all sorts of weird rules -- you can use a chi-squared test to compare models fitted with REML if their fixed-effects specifications are the same, but that's not actually a likelihood ratio test. (And indeed loglikelihood errors on those models.) So we have a stricter, more conservative heuristic in lrtest (safe and consistent with the broader JuliaStats ecosystem) but far fewer checks in likelihoodratiotest (for people who are focused on mixed models and know what they're doing). So users can pick their level of adventure. 😉
  3. Strictly speaking, the "deviance" in mixed models for LMM is off by some unknown additive constant because defining the fully saturated model is non trivial for a mixed model. This doesn't matter for its use in a likelihood ratio test because that additive constant cancels out.
  4. We also define custom methods for comparing GLM.LinearModel with a LinearMixedModel for the case where that's well defined.
  5. We have a whole slew of pretty-printing methods.

We could revisit these design decisions and:

  1. create specialized methods for StatsModels.lrtest with kwargs for e.g. nesting checks instead of just defining loglikelihood and isnested like we do now.
  2. create an lrtest method for comparing one GLM.LinearModel to varargs LinearMixedModels, but that's borderline type piracy when we don't own the method nor one of the type
  3. upstream pretty printing code, including some for other MIME types

This would have to happen as part of a breaking release, but it's definitely not enough for us to do a breaking release on its own (just too much effort in terms of rewrites and compatibility). In other words, I think the current situation makes sense but we should probably document it more thoroughly. And I'm not sure that changing our likelihoodratiotest to just be methods of lrtest with incompatible kwargs is actually a usability boost.

Part of the fundamental difficulty with mixed models -- both computationally and in terms of users' understanding -- is the way foundational regression theory is taught. Many equalities that hold for a particular subset of the exponential family are taught as the actual definitions of things like deviance, residual degrees of freedom, etc., instead of special cases. (@ararslan you've also felt this with beta regression!)

Addressing @ararslan's comment: I don't remember the default isnested definition, but maybe the "residual deviance must be strictly lower in models with more degrees of freedom" check should be part of isnested instead of lrtest itself. Or maybe some other function that can have methods defined for new model types without defining needing to specialize lrtest.

palday avatar Aug 02 '22 15:08 palday