Error with mgcv fitted gam

Open nilescbn opened this issue 3 years ago • 6 comments

First, thank you for this package and the documentation. Both have really benefited me.

I've typically had no issues using the package. Yesterday I started getting the following error message when trying to run visreg() on a gam model fitted with the mgcv package:

Error in exists(tail(as.character(CALL$data), 1), call.env) : invalid first argument

This is perplexing because I was able to run visreg on mgcv objects, on the same fitted models, as recently as yesterday. I have explored variations on the models but I'm fairly certain visreg worked on the same versions I'm getting the error message for now.

I tried the install_github version of visreg but got the same result.

I've spent ~30 min looking at Stack Overflow and other places for clues and so far haven't had luck.

I just downloaded the mgcViz package and it works with my models.

Thank you for the time. I don't know that this is a bug, I'm guessing not, but would appreciate any tips you might have.

I'm using the latest version of RStudio, 1.4.1717. Here is my session info:

R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] visreg_2.7.0.2 mgcv_1.8-37    nlme_3.1-152  

loaded via a namespace (and not attached):
 [1] lattice_0.20-44 digest_0.6.28   grid_4.1.1      evaluate_0.14   rlang_0.4.11   
 [6] renv_0.14.0     Matrix_1.3-4    rmarkdown_2.11  splines_4.1.1   tools_4.1.1    
[11] xfun_0.26       yaml_2.2.1      fastmap_1.1.0   compiler_4.1.1  htmltools_0.5.2
[16] knitr_1.35 

nilescbn, Sep 28 '21 18:09

Without a minimal reproducible example, I cannot possibly guess why your code is producing an error.

pbreheny, Sep 28 '21 18:09

Okay, I understand, but I couldn't think of how to make one quickly, and I thought there might be some issue with GAMs, as there seems to have been in the past. I will think harder about how to produce a reproducible example.

In the meantime, just to show you that it's a valid model object that I'm having issues with, here's the output from summary().

Family: gaussian 
Link function: identity 

Formula:
log(lbs_dsrk) ~ s(X_km, Y_km) + s(set_year) + set_month_fct + 
    s(SET_DEPTH) + s(HAUL_DURATION)

Parametric coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       3.139194   0.008832 355.444  < 2e-16 ***
set_month_fct.L   0.515514   0.032974  15.634  < 2e-16 ***
set_month_fct.Q   2.120625   0.040316  52.600  < 2e-16 ***
set_month_fct.C  -0.600362   0.031785 -18.889  < 2e-16 ***
set_month_fct^4  -0.162067   0.032327  -5.013 5.37e-07 ***
set_month_fct^5  -0.058738   0.031064  -1.891  0.05865 .  
set_month_fct^6  -0.399912   0.030937 -12.927  < 2e-16 ***
set_month_fct^7  -0.070763   0.030931  -2.288  0.02216 *  
set_month_fct^8   0.081701   0.030662   2.665  0.00771 ** 
set_month_fct^9  -0.037612   0.030387  -1.238  0.21580    
set_month_fct^10 -0.252248   0.030255  -8.337  < 2e-16 ***
set_month_fct^11 -0.063196   0.030749  -2.055  0.03986 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                    edf Ref.df      F p-value    
s(X_km,Y_km)     28.518 28.982  85.76  <2e-16 ***
s(set_year)       8.727  8.976 148.06  <2e-16 ***
s(SET_DEPTH)      8.225  8.787 481.93  <2e-16 ***
s(HAUL_DURATION)  8.629  8.942  94.79  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.212   Deviance explained = 21.3%
GCV = 3.2023  Scale est. = 3.1972    n = 41182

The model object is called fit_lbs. The error occurs when I try running visreg like so.

visreg(fit_lbs)

Its class is "gam" "glm" "lm".

Again, I will try and reproduce the error with a simpler model.

Thank you.

nilescbn, Sep 28 '21 19:09

Well, I doubt it has anything to do with the model itself, but I don't even know what the call to gam() looks like.

pbreheny, Sep 28 '21 20:09

I see. It was this (and again, this is with the mgcv package):

fit_lbs <- gam(log(lbs_dsrk) ~ s(X_km, Y_km) + s(set_year) + set_month_fct +
                 s(SET_DEPTH) + s(HAUL_DURATION),
               data = hauls_analysis[pos_dsrk == 1 & rev_category_dsrk == "zero", ],
               family = gaussian)

It's happening with a separate model as well, fitted to the same data, but that one is a logistic/binomial model.

Again, while I can't be 100% sure it was this same version of the model, I know visreg worked perfectly on at least something very similar just yesterday.

nilescbn, Sep 28 '21 20:09

I see; the issue is with

data = hauls_analysis[pos_dsrk == 1 & rev_category_dsrk == "zero", ]

I.e., applying an operation to the data during the call to gam(). This used to work fine, but R 4.0 changed some things and not every bug has been tracked down yet. Thank you very much for bringing this to my attention -- I'll fix it as soon as I can. In the meantime, you can avoid the error by subsetting the data outside the call to gam():

Sub <- subset(hauls_analysis, pos_dsrk == 1 & rev_category_dsrk == "zero")
fit <- gam(..., data=Sub)
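
For anyone who wants to try the workaround end to end, here is a minimal sketch. It assumes simulated data in place of hauls_analysis; the names dat, x, y and g are made up for illustration and are not from the original model. The point is only that `data` in the gam() call is a plain variable name.

```r
# Sketch only: simulated data stands in for hauls_analysis (hypothetical names).
library(mgcv)

set.seed(1)
dat <- data.frame(x = runif(200), g = rep(c("a", "b"), 100))
dat$y <- sin(2 * pi * dat$x) + rnorm(200, sd = 0.3)

# Subset first, so that `data` in the gam() call is a plain variable name.
Sub <- subset(dat, g == "a")

fit <- gam(y ~ s(x), data = Sub, family = gaussian)

# The stored call records `data = Sub` rather than a subsetting expression.
print(fit$call$data)

# Plot with visreg only if it is installed.
if (requireNamespace("visreg", quietly = TRUE)) {
  visreg::visreg(fit, "x")
}
```

Fitting on a named subset also keeps the subset around for later inspection, which can help when checking what the model actually saw.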

pbreheny, Sep 28 '21 20:09

Okay, that is something I did change yesterday afternoon (i.e., I started using a subset of the data in the model). I would never have thought of that on my own, so thank you for the quick replies. Very much appreciated.

And, yes, visreg() is indeed working now after following your recommendation.

nilescbn, Sep 28 '21 20:09
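
A note for later readers on why an inline subset might produce this particular message. The sketch below uses base R only and assumes nothing about visreg internals beyond the expression quoted in the error, `exists(tail(as.character(CALL$data), 1), call.env)`. When `data` is a subsetting call with an empty column index, the last element of its character form is an empty string, and exists() refuses an empty name. This is an illustration of the likely mechanism, not a confirmed diagnosis.

```r
# Two unevaluated `data` arguments, roughly as a matched call would record them.
inline_data <- quote(hauls_analysis[pos_dsrk == 1 & rev_category_dsrk == "zero", ])
named_data  <- quote(Sub)

# as.character() on a call gives one string per element of the call;
# the empty column index after the comma becomes an empty string.
print(as.character(inline_data))

last_inline <- tail(as.character(inline_data), 1)
last_named  <- tail(as.character(named_data), 1)

# exists() needs a non-empty name; capture the error message instead of stopping.
msg <- tryCatch(exists(last_inline), error = function(e) conditionMessage(e))
print(msg)

# With a plain variable name the lookup is well formed (FALSE here only
# because no object called Sub has been created in this sketch).
print(exists(last_named))
```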