performance Support for gamm4

Support for gamm4

Open DominiqueMakowski opened this issue 3 years ago • 13 comments

model <- gamm4::gamm4(mpg ~ s(hp), data=mtcars)
performance::performance(model)
#> Warning: Objects of class 'list' are no valid model objects.
#> NULL
model <- gamm4::gamm4(mpg ~ s(hp), random = ~ (1|vs), data=mtcars)
performance::performance(model)
#> Warning: Objects of class 'list' are no valid model objects.
#> NULL

^{Created on 2021-02-16 by the reprex package (v0.3.0)}

Feb 16 '21 09:02 DominiqueMakowski

    model <- gamm4::gamm4(mpg ~ s(hp), data=mtcars)
    
    performance::r2(model$mer)
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.703
#>      Marginal R2: 0.703
    performance::r2(model$gam)
#> $R2
#> Adjusted R2 
#>   0.7371402

^{Created on 2021-02-16 by the reprex package (v0.3.0)}

@strengejacke @mattansb which R2 you think we should "keep" the one from the underlying mer or the one from the gam 🤔

When there are random effects:

    model <- gamm4::gamm4(mpg ~ s(hp), random = ~ (1|vs), data=mtcars)
    
    performance::r2(model$mer)
#> Warning: Can't compute random effect variances. Some variance components equal zero. Your model may suffer from singulariy.
#>   Solution: Respecify random structure!
#>   You may also decrease the 'tolerance' level to enforce the calculation of random effect variances.
#> Random effect variances not available. Returned R2 does not account for random effects.
#> # R2 for Mixed Models
#> 
#>   Conditional R2: NA
#>      Marginal R2: 0.703
    performance::r2(model$gam)
#> $R2
#> Adjusted R2 
#>   0.7371402

^{Created on 2021-02-16 by the reprex package (v0.3.0)}

strange that the R2 is exactly the same in these two models though...

Feb 16 '21 14:02 DominiqueMakowski

I'll have to think a bit about which is correct - maybe we can tag Gavin S.?

(Not surprising they are the same - the warning says there is zero random variance.)

Feb 16 '21 15:02 mattansb

model <- gamm4::gamm4(mpg ~ s(hp), random = ~ (1|vs), data=mtcars)
performance::r2(model$mer, tolerance = 1e-10000)
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.703
#>      Marginal R2: 0.703

^{Created on 2021-02-16 by the reprex package (v1.0.0)}

Feb 16 '21 15:02 strengejacke

When the random factor matters, results are... weird:

GAMM4

model <- gamm4::gamm4(Sepal.Length ~ s(Sepal.Width), random = ~ (1|Species), data=iris)

performance::r2(model$mer)
#> Warning: Can't compute random effect variances. Some variance components equal zero. Your model may suffer from singulariy.
#>   Solution: Respecify random structure!
#>   You may also decrease the 'tolerance' level to enforce the calculation of random effect variances.
#> Random effect variances not available. Returned R2 does not account for random effects.
#> # R2 for Mixed Models
#> 
#>   Conditional R2: NA
#>      Marginal R2: 0.386
performance::r2(model$gam)
#> $R2
#> Adjusted R2 
#>  -0.2833367

model <- gamm4::gamm4(Sepal.Length ~ s(Sepal.Width), data=iris)

performance::r2(model$mer)
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.094
#>      Marginal R2: 0.094
performance::r2(model$gam)
#> $R2
#> Adjusted R2 
#>  0.06801244

^{Created on 2021-02-17 by the reprex package (v0.3.0)}

MGCV

model <- mgcv::gamm(Sepal.Length ~ s(Sepal.Width), random = list("Species" = ~1), data=iris)

# performance::r2(model$lme)  # LME not supported by performance
performance::r2(model$gam)
#> $R2
#> Adjusted R2 
#>  -0.2815428

model <- mgcv::gam(Sepal.Length ~ s(Sepal.Width), data=iris)

performance::r2(model)
#> $R2
#> Adjusted R2 
#>  0.06883041

^{Created on 2021-02-17 by the reprex package (v0.3.0)}

Feb 16 '21 23:02 DominiqueMakowski

maybe we can tag Gavin S.?

Yes please! (don't think I know him)

Feb 16 '21 23:02 DominiqueMakowski

https://twitter.com/mattansb/status/1361939113398726659

Feb 17 '21 07:02 mattansb

Sad that we cannot bump on twitter 😬

Feb 18 '21 11:02 DominiqueMakowski

lol

Feb 18 '21 12:02 mattansb

Dear @gavinsimpson after your great post or random factors & gams, we have no option but to turn to your authority 😁 would you have by any chance some idea or intuition about how to correctly extract the R2 of a gamm model?

Feb 19 '21 23:02 DominiqueMakowski

Here are 8 versions of theoretically equivalent models (with the help of @tjmahr 's new blogpost ☺️), with different engines and random effects entered as such or splines.

library(rstanarm)

# MGCV
model <- mgcv::gam(Sepal.Length ~ s(Sepal.Width) + s(Species, bs="re"), data=iris)
performance::r2(model)
#> $R2
#> Adjusted R2 
#>   0.7202745

model <- mgcv::gamm(Sepal.Length ~ s(Sepal.Width), random = list("Species" = ~1), data=iris)
performance::r2(model$gam)
#> $R2
#> Adjusted R2 
#>  -0.2815428
performance::r2(model$lme)
#> [1] NA

# GAMM4
model <- gamm4::gamm4(Sepal.Length ~ s(Sepal.Width), random = ~ (1|Species), data=iris)
performance::r2(model$gam)
#> $R2
#> Adjusted R2 
#>  -0.2833367
performance::r2(model$mer)
#> Warning: Can't compute random effect variances. Some variance components equal zero. Your model may suffer from singulariy.
#>   Solution: Respecify random structure!
#>   You may also decrease the 'tolerance' level to enforce the calculation of random effect variances.
#> Random effect variances not available. Returned R2 does not account for random effects.
#> # R2 for Mixed Models
#> 
#>   Conditional R2: NA
#>      Marginal R2: 0.386


model <- gamm4::gamm4(Sepal.Length ~ s(Sepal.Width) + s(Species, bs="re"), data=iris)
performance::r2(model$gam)
#> $R2
#> Adjusted R2 
#>   0.7202724
performance::r2(model$mer)
#> Warning: Can't compute random effect variances. Some variance components equal zero. Your model may suffer from singulariy.
#>   Solution: Respecify random structure!
#>   You may also decrease the 'tolerance' level to enforce the calculation of random effect variances.
#> Random effect variances not available. Returned R2 does not account for random effects.
#> # R2 for Mixed Models
#> 
#>   Conditional R2: NA
#>      Marginal R2: 0.386

# RSTANARM
model <- rstanarm::stan_gamm4(Sepal.Length ~ s(Sepal.Width), random = ~ (1|Species), data=iris, refresh = 0, iter=6000, seed=3)
performance::r2(model)
#> # Bayesian R2 with Standard Error
#> 
#>   Conditional R2: 0.712 (0.89% CI [0.658, 0.763])
#>      Marginal R2: 0.358 (0.89% CI [0.240, 0.468])

model <- rstanarm::stan_gamm4(Sepal.Length ~ s(Sepal.Width) + s(Species, bs="re"), data=iris, refresh = 0, iter=6000, seed=3)
#> Warning: There were 8 divergent transitions after warmup. See
#> http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
performance::r2(model)
#> # Bayesian R2 with Standard Error
#> 
#>   Conditional R2: 0.701 (0.89% CI [0.392, 0.903])


# BRMS
library(brms)

model <- brms::brm(Sepal.Length ~ s(Sepal.Width) + (1|Species), data=iris, refresh = 0, iter=6000, backend = "cmdstanr")
performance::r2(model)
#> # Bayesian R2 with Standard Error
#> 
#>   Conditional R2: 0.722 (0.89% CI [0.688, 0.755])
#>      Marginal R2: 0.123 (0.89% CI [0.084, 0.159])

model <- brms::brm(Sepal.Length ~ s(Sepal.Width) + s(Species, bs="re"), data=iris, refresh = 0, iter=6000, backend = "cmdstanr")
performance::r2(model)
#> # Bayesian R2 with Standard Error
#> 
#>   Conditional R2: 0.722 (0.89% CI [0.688, 0.755])

^{Created on 2021-02-27 by the reprex package (v0.3.0)}

It seems like the "true" model's R2 is around 0.72 (assumption based on the most common number from the above).

Some observations:

For mgcv::gamm, the output seems to make no sense 🤷‍♂️
For gamm4, when no random effects entered as such (but as splines), the correct R2 is contained in the $gam sub-object. Note: we currently make a distinction between the two random specifications at insight's level; they are categorized differently (see https://github.com/easystats/insight/issues/254)
For gamm4, when random entered as random, it seems to makes no sense,
For gamm4; the marginal R2 from $mer could be correct? (at least it's consistent and close to the rstanarm output)
For rstanarm output, the output of the random=random model kinda makes sense, but it the specification of random as smooth leads to the model as being treated as a non-mixed one (again possibly related to https://github.com/easystats/insight/issues/254 and, by extension, to how we categorize it in model_info).
The R2 for the random=random model are close, but the one from the random-smooth is a bit lower (?)
For brms, the marginal R2 is lower than the other ones. But again, the conditional R2 is in line with the assumed correct one

[sidenote: just noticed that bayes_r2 shows 0.89%, but maybe I haven't updated since the last a few days ^^]

Feb 27 '21 00:02 DominiqueMakowski

Maybe our new jedi master @bwiernik has some thoughts on that ☺️

May 04 '21 02:05 DominiqueMakowski

Honestly never dived into any of the gam modeling functions 😞

May 04 '21 03:05 bwiernik

The Bayesian R2 conceptually makes the most sense to me for all of these cases. Neither of the two component models includes all of the information that should go into an R2 (similar to ignoring the zero inflation component when computing an R2)

Jun 23 '21 11:06 bwiernik

performance performance copied to clipboard

Support for gamm4

GAMM4

MGCV

performance
performance copied to clipboard