
insight::get_data issue with subset argument provided via eval(parse(text=...))

Open AlpDYel opened this issue 2 years ago • 5 comments

I fitted a number of models in a loop that uses eval(parse(text = thing_I_want)) to specify the model and family. After loading the resulting models, I call insight::get_data() on a model object and get this error:

Error in as.character(x) :
cannot coerce type 'closure' to vector of type 'character'


AlpDYel avatar Oct 12 '23 17:10 AlpDYel

Do you have a reproducible example?

strengejacke avatar Oct 25 '23 19:10 strengejacke

I will try to get one. I am using a custom function to fit and save models in parallel, then load and name the resulting models, so I would have to simplify parts of the pipeline to reproduce the error without publishing a lot of tangentially related code.

AlpDYel avatar Nov 09 '23 19:11 AlpDYel

I had to revisit this in a new project, so I went ahead and isolated the issue:

library(tidyverse)
library(lme4)
library(performance)

data(troutegg,package="faraway")
  
#This does not work
fit_mod <- function(formula,weights,data,family,subset){
  mod <-glmer(as.formula(formula),weight=eval(parse(text=weights)),
              data=data,family= eval(parse(text = family)),subset=eval(parse(text=subset)))
  return(mod)
}
mod <- fit_mod(formula = "survive/total~period+(1|location)",
               weights = "total",
               data=troutegg,
               family="binomial('logit')",
               subset = "total>0")

performance::binned_residuals(mod)
#This works
mod2 <-glmer(as.formula("survive/total~period+(1|location)"),weight=eval(parse(text="total")),
             data=troutegg,nAGQ=0,family= eval(parse(text = "binomial('logit')")))
performance::binned_residuals(mod2)

#Digging deeper, problem comes from
insight::get_data(mod)

#This does not work either
fit_mod_alt <- function(formula,data,weights,troutegg,family,subset){
  mod <-glmer(as.formula(formula),weight=eval(parse(text=weights)),
              data=get(data,envir = .GlobalEnv),family= eval(parse(text = family)),subset=eval(parse(text=subset)))
  return(mod)
}
mod3 <- fit_mod_alt(formula = "survive/total~period+(1|location)",
               weights = "total",
               data="troutegg",
               family="binomial('logit')",
               subset = "total>0")
performance::binned_residuals(mod3)

If I understand correctly, when the model is fitted through the fit_mod function, the data is recorded internally as data rather than troutegg. That causes problems for the get_data function from insight, which some performance functions such as binned_residuals use internally.
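
The stored call can be inspected directly. Below is a minimal sketch that uses base R's lm() and the built-in mtcars data as stand-ins for glmer() and troutegg. Both substitutions, and the wrapper name fit_wrapped, are assumptions made for illustration. It shows that the fitted object records the wrapper's argument names rather than the values passed in.

```r
# Stand-in wrapper: lm() and mtcars replace glmer() and troutegg (assumption).
fit_wrapped <- function(formula, data, subset) {
  lm(as.formula(formula), data = data, subset = eval(parse(text = subset)))
}

m <- fit_wrapped("mpg ~ wt", data = mtcars, subset = "cyl > 4")

# The call is stored unevaluated, so it holds the wrapper's local names.
print(m$call)

data_arg   <- m$call$data    # the symbol `data`, not `mtcars`
subset_arg <- m$call$subset  # the expression eval(parse(text = subset))
```

Any code that later re-evaluates these arguments outside fit_wrapped sees whatever data and subset mean in its own scope, not the objects that were used for fitting.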

AlpDYel avatar Feb 10 '24 01:02 AlpDYel

I also tried assigning troutegg to data to see if that helps. It did not:

#This also does not work
data <- troutegg
fit_mod_alt <- function(formula,data,weights,troutegg,family,subset){
  mod <-glmer(as.formula(formula),weight=eval(parse(text=weights)),
              data=data,family= eval(parse(text = family)),subset=eval(parse(text=subset)))
  return(mod)
}
mod3 <- fit_mod_alt(formula = "survive/total~period+(1|location)",
               weights = "total",
               data=data,
               family="binomial('logit')",
               subset = "total>0")
performance::binned_residuals(mod3)

AlpDYel avatar Feb 10 '24 02:02 AlpDYel

So I finally figured out the root of the issue:

The get_data function does not work as intended when I use the subset argument. It works OK when I subset the data passed as input (for example with tidyverse filter) instead of using the subset argument. It is likely a non-issue for 99% of cases.
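
One plausible reading of the error message, not verified against insight's source: the model call stores subset as the unevaluated expression eval(parse(text = subset)). If that expression is evaluated anywhere outside the wrapper, the name subset no longer refers to the string "total>0" but to the base R function subset(), and parse() cannot turn a function into text. The sketch below reproduces that situation with base R only; mtcars and the condition "cyl > 4" are stand-ins.

```r
# The expression as it would sit in the stored model call.
stored <- quote(eval(parse(text = subset)))

# Outside the wrapper: `subset` resolves to base::subset (a closure),
# so parse(text = <function>) fails.
res <- tryCatch(
  eval(stored, mtcars, baseenv()),
  error = function(e) e
)
print(conditionMessage(res))

# Inside a scope where `subset` is the original string, the same
# expression evaluates to a logical row filter.
wrapper_scope <- list2env(list(subset = "cyl > 4"), parent = baseenv())
ok <- eval(stored, mtcars, wrapper_scope)
print(sum(ok))
```

Whether get_data() takes exactly this path is an assumption; the sketch only shows that the reported message is what base R produces in this situation.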

#Works!
mod4 <- fit_mod_alt(formula = "survive/total~period+(1|location)",
                    weights = "total",
                    data= troutegg,
                    family="binomial('logit')")
insight::get_data(mod4)
#Alternative to subset with tidyverse
mod5 <- fit_mod_alt(formula = "survive/total~period+(1|location)",
                    weights = "total",
                    data= troutegg %>% filter(eval(parse(text="location !=5"))),
                    family="binomial('logit')")

insight::get_data(mod5)
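
Another option, if the subset argument has to stay, might be to build the model call so that it contains the real data name and the real subset expression instead of the wrapper's local variables. This is a sketch under assumptions: lm() and mtcars stand in for glmer() and troutegg, fit_embedded and data_name are hypothetical names, and whether insight::get_data() then succeeds has not been tested here.

```r
# Hypothetical wrapper: splice the actual expressions into the call
# before evaluating it, so nothing refers to wrapper-local names.
fit_embedded <- function(formula, data_name, subset) {
  cl <- bquote(
    lm(.(as.formula(formula, env = globalenv())),
       data   = .(as.name(data_name)),   # e.g. the symbol `mtcars`
       subset = .(str2lang(subset)))     # e.g. the expression cyl > 4
  )
  eval(cl, envir = globalenv())
}

m_emb <- fit_embedded("mpg ~ wt", "mtcars", "cyl > 4")

# The stored call now names the data set and the condition directly.
print(m_emb$call$data)
print(m_emb$call$subset)
```

The data name is passed as a string here so that it can be turned into a symbol; the data set therefore has to be visible from the global environment.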

AlpDYel avatar Feb 12 '24 16:02 AlpDYel