performance icon indicating copy to clipboard operation
performance copied to clipboard

r2_nakagawa breaking when used with purrr::map

Open seanhardison1 opened this issue 4 years ago • 12 comments

I've found that r2_nakagawa breaks when used in a purrr::map context.

Here's a simple example where r2_nakagawa() is used with purrr::map. The same two models are fitted to each species in the Salamanders data set, and Nakagawa's R^2 is collected for each.

library(performance)
library(glmmTMB)
library(tidyverse)

mod_func <- function(df){
  
  form_list <- list()
  form_list[[1]] <- formula(count ~ mined + (1|site))
  form_list[[2]] <- formula(count ~ mined + DOY + (1|site))
  fam <- "poisson"
  
  r2_out <- NULL
  for (i in 1:length(form_list)){
    mod <- glmmTMB(form_list[[i]], data = df, family = fam)
    r2_out[[i]] <- r2_nakagawa(mod)
  }
  return(r2_out)
}

nested_mods <- 
  Salamanders %>% 
  group_by(spp) %>% 
  nest() %>% 
  mutate(model = map(data, mod_func))

nested_mods$model

The output gives conditional R^2 = 1 for each model

However, when models are specified individually, r2_nakagawa works as expected

# but in isolation r2_nakagawa works, even when formula is specified as a list
form_list <- list()
form_list[[1]] <- formula(count ~ mined + (1|site))
fam <- "poisson"
mod <- glmmTMB(form_list[[1]], data = Salamanders %>% 
                 dplyr::filter(spp == "GP"), family = fam)
r2_nakagawa(mod)

seanhardison1 avatar Sep 22 '20 15:09 seanhardison1

I'm also having an issue with purrr::map() and the performance::r2() function. I have re-installed performance from Github, but the issue still persists. For me, it doesn't recognise the data frame. The error I get is:

Error: Problem with `mutate()` input `r2`.
x object '.x' not found
ℹ Input `r2` is `map(multinomial_regression, ~r2(.x))`.

I don't have this issue with performance::check_collinearity().

My session info:

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin19.5.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /usr/local/Cellar/openblas/0.3.10_1/lib/libopenblasp-r0.3.10.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] performance_0.5.0 ggthemes_4.2.0    ggtext_0.1.0      nnet_7.3-14       forcats_0.5.0     stringr_1.4.0     dplyr_1.0.2       purrr_0.3.4       readr_1.3.1      
[10] tidyr_1.1.2       tibble_3.0.4      ggplot2_3.3.2     tidyverse_1.3.0  

loaded via a namespace (and not attached):
  [1] readxl_1.3.1         backports_1.1.10     plyr_1.8.6           igraph_1.2.5         listenv_0.8.0        crosstalk_1.1.0.1    rstantools_2.1.1     inline_0.3.15       
  [9] digest_0.6.26        htmltools_0.5.0      rsconnect_0.8.16     fansi_0.4.1          magrittr_1.5         globals_0.12.5       brms_2.13.5          modelr_0.1.8        
 [17] RcppParallel_5.0.2   matrixStats_0.56.0   MCMCpack_1.4-9       xts_0.12.1           sandwich_2.5-1       prettyunits_1.1.1    colorspace_1.4-1     blob_1.2.1          
 [25] rvest_0.3.6          haven_2.3.1          callr_3.5.1          crayon_1.3.4         jsonlite_1.7.1       zoo_1.8-8            glue_1.4.2           gtable_0.3.0        
 [33] MatrixModels_0.4-1   V8_3.2.0             pkgbuild_1.1.0       rstan_2.21.2         future.apply_1.6.0   abind_1.4-5          SparseM_1.78         scales_1.1.1        
 [41] mvtnorm_1.1-1        DBI_1.1.0            miniUI_0.1.1.1       Rcpp_1.0.5           xtable_1.8-4         gridtext_0.1.1       tmvnsim_1.0-2        stats4_4.0.2        
 [49] StanHeaders_2.21.0-6 DT_0.15              htmlwidgets_1.5.1    httr_1.4.2           threejs_0.3.3        lavaan_0.6-7         ellipsis_0.3.1       pkgconfig_2.0.3     
 [57] loo_2.3.1            dbplyr_1.4.4         here_0.1             tidyselect_1.1.0     rlang_0.4.8          reshape2_1.4.4       later_1.1.0.1        munsell_0.5.0       
 [65] cellranger_1.1.0     tools_4.0.2          cli_2.1.0            generics_0.0.2       broom_0.7.0          ggridges_0.5.2       fastmap_1.0.1        blavaan_0.3-10      
 [73] mcmc_0.9-7           processx_3.4.4       fs_1.5.0             future_1.18.0        nlme_3.1-148         mime_0.9             quantreg_5.70        xml2_1.3.2          
 [81] nonnest2_0.5-5       compiler_4.0.2       bayesplot_1.7.2      shinythemes_1.1.2    rstudioapi_0.11      curl_4.3             reprex_0.3.0         pbivnorm_0.6.0      
 [89] stringi_1.4.6        ps_1.4.0             Brobdingnag_1.2-6    lattice_0.20-41      Matrix_1.2-18        markdown_1.1         shinyjs_2.0.0        vctrs_0.3.4         
 [97] CompQuadForm_1.4.3   pillar_1.4.6         lifecycle_0.2.0      bridgesampling_1.0-0 insight_0.9.6.1      conquer_1.0.2        httpuv_1.5.4         R6_2.4.1            
[105] promises_1.1.1       gridExtra_2.3        codetools_0.2-16     colourpicker_1.1.0   MASS_7.3-51.6        gtools_3.8.2         assertthat_0.2.1     rprojroot_1.3-2     
[113] withr_2.3.0          shinystan_2.5.0      mnormt_2.0.2         bayestestR_0.7.2     parallel_4.0.2       hms_0.5.3            grid_4.0.2           coda_0.19-3         
[121] shiny_1.5.0          lubridate_1.7.9      base64enc_0.1-3      dygraphs_1.1.1.6    

zmeers avatar Oct 17 '20 22:10 zmeers

@seanhardison1 I found the reason why this does not work... Due to the structure of the nested data / models, update() fails, which is required for the null-model:

library(performance)
library(glmmTMB)
library(tidyverse)

mod_func <- function(df){
  
  form_list <- list()
  form_list[[1]] <- formula(count ~ mined + (1|site))
  form_list[[2]] <- formula(count ~ mined + DOY + (1|site))
  fam <- "poisson"
  
  mod <- glmmTMB(form_list[[1]], data = df, family = fam)
}

nested_mods <- 
  Salamanders %>% 
  group_by(spp) %>% 
  nest() %>% 
  mutate(model = map(data, mod_func))

# doesn't work
insight::null_model(nested_mods$model[[1]])
#> Can't calculate null-model. Probably the data that was used to fit the model cannot be found.
#> NULL

# reason
update(nested_mods$model[[1]], formula. = ~1)
#> Error in glmmTMB(formula = count ~ 1, data = df, family = fam, ziformula = ~0, : object 'fam' not found


form_list <- list()
form_list[[1]] <- formula(count ~ mined + (1|site))
fam <- "poisson"
mod <- glmmTMB(form_list[[1]], data = Salamanders %>% 
                 dplyr::filter(spp == "GP"), family = fam)

# works
insight::null_model(mod)
#> Formula:          count ~ (1 | site)
#> Data: Salamanders %>% dplyr::filter(spp == "GP")
#>       AIC       BIC    logLik  df.resid 
#>  260.7390  265.7826 -128.3695        90 
#> Random-effects (co)variances:
#> 
#> Conditional model:
#>  Groups Name        Std.Dev.
#>  site   (Intercept) 1.845   
#> 
#> Number of obs: 92 / Conditional model: site, 23
#> 
#> Fixed Effects:
#> 
#> Conditional model:
#> (Intercept)  
#>     -0.9884

Created on 2020-10-22 by the reprex package (v0.3.0)

I'm not sure how to address this at the moment, but I'll let this issue open, in case I find a solution later.

strengejacke avatar Oct 22 '20 18:10 strengejacke

@zmeers Do you have a small reproducible example?

strengejacke avatar Oct 22 '20 18:10 strengejacke

Interestingly, for a simpler case, it works with mixed models:

library(mice)
library(lme4)
library(performance)
data("nhanes2")
imp <- mice(nhanes2)
models <- lapply(1:5, function(i) {
  lme4::lmer(bmi ~ age + hyp + (1 | chl), data = complete(imp, action = i))
})

lapply(models, r2)
#> Warning: Can't compute random effect variances. Some variance components equal zero.
#>   Solution: Respecify random structure!
#> Random effect variances not available. Returned R2 does not account for random effects.
#> [[1]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.917
#>      Marginal R2: 0.383
#> 
#> [[2]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.796
#>      Marginal R2: 0.283
#> 
#> [[3]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.943
#>      Marginal R2: 0.303
#> 
#> [[4]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: NA
#>      Marginal R2: 0.286
#> 
#> [[5]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.906
#>      Marginal R2: 0.278

Created on 2020-10-22 by the reprex package (v0.3.0)

strengejacke avatar Oct 22 '20 18:10 strengejacke

I am new to posting issues, feedback is welcomed and I really appreciate your great software.

library(tidyverse)
library(performance)
library(nlme)
library(broom)

data(iris)

model <- nlme::lme(Sepal.Length ~ Petal.Length, random = ~1 | Species, data = iris)
r2(model)
icc(model)

# ICC for lmer models, if I could get performance::icc to work in the map / mutate that would be better.
ICC_lme <- function(out){
  varests <- as.numeric(VarCorr(out)[1:2])
  varests[1]/sum(varests)
}




boot_straps <- iris %>% 
  modelr::bootstrap(n = 100, id = 'model_iter') %>%
  mutate(mod_1 = map(strap, ~nlme::lme(Sepal.Length ~ Petal.Length, random = ~1 | Species, data =.)),
         mod_2 = map(strap, ~nlme::lme(Sepal.Length ~ Petal.Width, random = ~1 | Species, data =.)),
         
         tidy_mod_1 = map(mod_1, tidy),
         tidy_mod_2 = map(mod_2, tidy),
         
         AIC_1 = map(mod_1, AIC),
         AIC_2 = map(mod_2, AIC),
         
         ICC_1 = map(mod_1, ICC_lme),
         ICC_2 = map(mod_2, ICC_lme),
         
         R2_1 = map(mod_1, performance::r2),
         R2_2 = map(mod_2, performance::r2)
         )

boot_straps %>% 
  tidylog::select(model_iter, starts_with('AIC')) %>%
  unnest() %>% 
  pivot_longer(cols = -model_iter,
               names_to = 'model',
               values_to = 'AIC') 

boot_straps %>% 
  tidylog::select(model_iter, starts_with('R2_')) %>%
  unnest() 

boot_straps$R2_1

DHLocke avatar Oct 22 '20 18:10 DHLocke

@DHLocke With this commit, I added an informative warning related to your issue.

This code-line:

map(strap, ~nlme::lme(Sepal.Length ~ Petal.Length, random = ~1 | Species, data =.))

causes that the model objects have no data:

boot_straps$mod_1[[1]]$data
#> NULL

while this works:

library(nlme)
data(iris)
model <- nlme::lme(Sepal.Length ~ Petal.Length, random = ~1 | Species, data = iris)
head(model$data)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa

Not sure how to address this, but I'll try to investigate further...

strengejacke avatar Nov 02 '20 16:11 strengejacke

@DHLocke The reason seems to be one of the tidyverse-quirks, which you can, however, workaround when not using anonymous functions inside map():

library(tidyverse)
library(performance)
library(nlme)
library(broom)
library(broom.mixed)
data(iris)

boot_straps <- iris %>% 
  modelr::bootstrap(n = 10, id = 'model_iter') %>%
  mutate(mod_1 = map(strap, function(d) nlme::lme(Sepal.Length ~ Petal.Length, random = ~1 | Species, data = as.data.frame(d))),
         mod_2 = map(strap, function(d) nlme::lme(Sepal.Length ~ Petal.Width, random = ~1 | Species, data = as.data.frame(d))),
         
         tidy_mod_1 = map(mod_1, tidy),
         tidy_mod_2 = map(mod_2, tidy),
         
         AIC_1 = map(mod_1, AIC),
         AIC_2 = map(mod_2, AIC),
         
         ICC_1 = map(mod_1, performance::icc),
         ICC_2 = map(mod_2, performance::icc),
         
         R2_1 = map(mod_1, performance::r2),
         R2_2 = map(mod_2, performance::r2)
  )

boot_straps$R2_1
#> [[1]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.975
#>      Marginal R2: 0.621
#> 
#> [[2]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.981
#>      Marginal R2: 0.646
#> 
#> [[3]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.965
#>      Marginal R2: 0.642
#> 
#> [[4]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.973
#>      Marginal R2: 0.643
#> 
#> [[5]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.962
#>      Marginal R2: 0.658
#> 
#> [[6]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.976
#>      Marginal R2: 0.674
#> 
#> [[7]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.963
#>      Marginal R2: 0.618
#> 
#> [[8]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.974
#>      Marginal R2: 0.632
#> 
#> [[9]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.968
#>      Marginal R2: 0.629
#> 
#> [[10]]
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.974
#>      Marginal R2: 0.641

Created on 2020-11-03 by the reprex package (v0.3.0)

strengejacke avatar Nov 03 '20 08:11 strengejacke

@DHLocke The issue with non-working icc() for lme() should be fixed in the latest GitHub-version of performance as well.

strengejacke avatar Nov 03 '20 08:11 strengejacke

Wonderful, thanks so much.

DHLocke avatar Nov 10 '20 17:11 DHLocke

I think I'm having a similar issue, but when calling r2_nakagawa from inside a function (i.e. a more general issue than map).

Maybe stats::update expects to be called in the global scope and can't find the data otherwise? So it could perhaps be fixed by taking the data as an argument and somehow including it in the environment for that call? I don't know enough about scoping etc to be at all confident though...

Thanks for a nicely comprehensive package. Will

MWE

library("glmmTMB")
library("performance")

mwe <- function() {
  Owls <- transform(Owls,
    Nest=reorder(Nest,NegPerChick),
    NCalls=SiblingNegotiation,
    FT=FoodTreatment)
  fit <- glmmTMB(
    NCalls ~ (FT + ArrivalTime) * SexParent + offset(log(BroodSize)) + (1 | Nest),
    data=Owls, family=nbinom2)
  print(r2_nakagawa(fit))
}
mwe()
# Can't calculate null-model. Probably the data that was used to fit the model cannot be found.
# # R2 for Mixed Models

#   Conditional R2: 1.000
#      Marginal R2: 0.593

whereas if I call the same example in the global scope:

Owls <- transform(Owls,
  Nest=reorder(Nest,NegPerChick),
  NCalls=SiblingNegotiation,
  FT=FoodTreatment)
fit <- glmmTMB(
  NCalls ~ (FT + ArrivalTime) * SexParent + offset(log(BroodSize)) + (1 | Nest),
  data=Owls, family=nbinom2)
print(r2_nakagawa(fit))
# # R2 for Mixed Models

#   Conditional R2: 0.244
#      Marginal R2: 0.144

(I know that I'm slightly out of date on updating packages, but I'm hoping that's not the issue here.)

sessionInfo()
# R version 4.0.3 (2020-10-10)
# Platform: x86_64-conda-linux-gnu (64-bit)
# Running under: CentOS Linux 7 (Core)

# Matrix products: default
# BLAS/LAPACK: /pstore/home/macnairw/.conda/envs/r_4.0.3/lib/libopenblasp-r0.3.12.so

# locale:
#  [1] LC_CTYPE=C                 LC_NUMERIC=C
#  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
#  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
#  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
#  [9] LC_ADDRESS=C               LC_TELEPHONE=C
# [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base

# other attached packages:
# [1] performance_0.7.2 glmmTMB_1.0.2.1   colorout_1.2-2    workflowr_1.6.2

# loaded via a namespace (and not attached):
#  [1] Rcpp_1.0.6        pillar_1.6.1      compiler_4.0.3    later_1.2.0
#  [5] nloptr_1.2.2.2    TMB_1.7.20        tools_4.0.3       boot_1.3-28
#  [9] digest_0.6.27     lme4_1.1-27.1     evaluate_0.14     lifecycle_1.0.0
# [13] tibble_3.1.2      nlme_3.1-152      lattice_0.20-44   pkgconfig_2.0.3
# [17] rlang_0.4.11      Matrix_1.3-4      xfun_0.24         knitr_1.33
# [21] fs_1.5.0          vctrs_0.3.8       grid_4.0.3        R6_2.5.0
# [25] fansi_0.5.0       rmarkdown_2.8     minqa_1.2.4       magrittr_2.0.1
# [29] promises_1.2.0.1  ellipsis_0.3.2    htmltools_0.5.1.1 splines_4.0.3
# [33] MASS_7.3-54       insight_0.14.2    httpuv_1.6.1      utf8_1.2.1
# [37] crayon_1.4.1

wmacnair avatar Jul 07 '21 11:07 wmacnair

This all sounds like an environment scoping issue @strengejacke

bwiernik avatar Jul 08 '21 03:07 bwiernik

Yeah, I must see when I find some time to get back to this issue again...

strengejacke avatar Aug 18 '21 15:08 strengejacke