need response variable in new data for add_pi

Open nlichti opened this issue 4 years ago • 4 comments

Very useful package. I have a minor suggestion: in add_pi (and possibly other functions - I haven't checked), an error is thrown if tb does not include a column for the response variable. The actual values in the column are ignored, but the column has to be present. I'd guess this is due to some internal code similar to:

X <- formula(fit) %>% model.matrix(data = tb)

used to get the design matrix for simulation-based prediction intervals. Using a few more steps should eliminate the need to include the response. Something like:

chr_formula <- formula(fit) %>% deparse() %>% strsplit(' ') %>% getElement(1)
X <- as.formula(chr_formula[-1]) %>% model.matrix(data = tb)

I noticed this specifically with a Poisson GLM. add_ci did not require the response to be in tb.
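For reference, the idea above can also be sketched in base R with delete.response(), which drops the response from a model's terms object. This is only an illustration of the technique and makes no claim about how ciTools builds its design matrix internally; the model (carb ~ disp on mtcars) and the names tb, rhs_terms, X, eta, and mu are assumptions chosen for the example.

```r
# Fit a Poisson GLM on a built-in dataset (illustrative choice of variables).
fit <- glm(carb ~ disp, family = poisson(), data = mtcars)

# New data containing only the predictor, not the response `carb`.
tb <- data.frame(disp = c(100, 200, 300))

# Building the design matrix from the full formula needs the response,
# so this fails when `carb` is neither in `tb` nor visible elsewhere.
full <- tryCatch(model.matrix(formula(fit), data = tb), error = function(e) e)

# Dropping the response from the model's terms removes that requirement.
rhs_terms <- delete.response(terms(fit))
X <- model.matrix(rhs_terms, data = tb)

# Linear predictor and fitted mean for the new rows.
eta <- drop(X %*% coef(fit))
mu <- family(fit)$linkinv(eta)
```

Here mu should agree with predict(fit, newdata = tb, type = "response"), which is a quick way to check that X was built correctly.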

nlichti avatar Dec 05 '19 15:12 nlichti

Thanks for this note. I will take a look.

jthaman avatar Aug 19 '20 13:08 jthaman

I think I might be running into a bug related to this. Here's a reprex:

library(tidyverse)
mod <- glm(mpg ~ disp, family = Gamma(), data = mtcars)
tibble(disp = seq(min(mtcars$disp), max(mtcars$disp), length.out = 10)) %>% 
    ciTools::add_pi(mod)
#> Error in model.frame.default(formula = mpg ~ disp, data = structure(list(: invalid type (list) for variable 'mpg'

Created on 2021-09-05 by the reprex package (v2.0.0)
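
One possible reading of the error message, offered as a guess rather than a confirmed diagnosis: library(tidyverse) attaches ggplot2, which exports a dataset named mpg. If the response is looked up while the new data lacks an mpg column, R may fall back to that dataset, which is a data frame (a list) and not a numeric vector. The sketch below reproduces the same message in base R, using a stand-in data frame named mpg in place of the ggplot2 dataset; the stand-in and the names newdata and msg are assumptions for illustration.

```r
# Stand-in for ggplot2's `mpg` dataset: any data frame with that name will do.
mpg <- data.frame(a = 1:3)

# New data with the predictor only, as in the reprex.
newdata <- data.frame(disp = c(100, 200, 300))

# `mpg` is not a column of `newdata`, so model.frame() finds the data frame
# named `mpg` in the enclosing environment and rejects it as a variable.
msg <- tryCatch(
  {
    model.frame(mpg ~ disp, data = newdata)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```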

Weirdly, no bug if I leave out the family argument:

library(tidyverse)
mod <- glm(mpg ~ disp, data = mtcars)
tibble(disp = seq(min(mtcars$disp), max(mtcars$disp), length.out = 10)) %>% 
    ciTools::add_pi(mod)
#>        disp     pred  LPB0.025 UPB0.975
#> 1   71.1000 26.66946 19.753419 33.58550
#> 2  115.6444 24.83356 17.999921 31.66719
#> 3  160.1889 22.99765 16.220266 29.77504
#> 4  204.7333 21.16175 14.413797 27.90969
#> 5  249.2778 19.32584 12.580164 26.07152
#> 6  293.8222 17.48994 10.719340 24.26053
#> 7  338.3667 15.65403  8.831623 22.47644
#> 8  382.9111 13.81813  6.917619 20.71864
#> 9  427.4556 11.98222  4.978206 18.98624
#> 10 472.0000 10.14632  3.014492 17.27814

Created on 2021-09-05 by the reprex package (v2.0.0)

FlukeAndFeather avatar Sep 05 '21 22:09 FlukeAndFeather

Ran into this bug as well.

akarlinsky avatar Mar 08 '22 14:03 akarlinsky

Has anyone found a way around it? I can't estimate a PI due to this bug. I tried fitting the glm with y=TRUE to keep the dependent variable in the model object. I also tried adding a dependent variable column to the new data. No luck :(

akarlinsky avatar Jan 18 '23 09:01 akarlinsky