Consider a different style of butchering for use cases that prioritize size of model object
In rstudio/vetiver-r#264, @lschneiderbauer pointed out that for their use case they would like to remove more components from the model. They don't need the components used for prediction/confidence intervals, but they do need the model object to be smaller. They would like something along these lines:
library(butcher)
library(vetiver)
more_cars <- mtcars[rep(1:32, each = 1e4),]
cars_lm <- lm(mpg ~ ., data = more_cars)
weigh(cars_lm)
#> # A tibble: 25 × 2
#>    object         size
#>    <chr>         <dbl>
#>  1 qr.qr         54.0
#>  2 residuals     28.4
#>  3 fitted.values 28.4
#>  4 effects        5.12
#>  5 model.mpg      2.56
#>  6 model.cyl      2.56
#>  7 model.disp     2.56
#>  8 model.hp       2.56
#>  9 model.drat     2.56
#> 10 model.wt       2.56
#> # ℹ 15 more rows
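axe_custom <- function(x) {
  # Replace the residuals with an empty vector
  x <- butcher:::exchange(x, "residuals", numeric(0))
  # Replace the QR matrix (the largest component) with a 1 x 1 placeholder
  x$qr <- butcher:::exchange(x$qr, "qr", matrix(0))
  x
}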
axed_lm <- axe_custom(cars_lm)
weigh(axed_lm)
#> # A tibble: 25 × 2
#>    object         size
#>    <chr>         <dbl>
#>  1 fitted.values 28.4
#>  2 effects        5.12
#>  3 model.mpg      2.56
#>  4 model.cyl      2.56
#>  5 model.disp     2.56
#>  6 model.hp       2.56
#>  7 model.drat     2.56
#>  8 model.wt       2.56
#>  9 model.qsec     2.56
#> 10 model.vs       2.56
#> # ℹ 15 more rows
v <- vetiver_model(axed_lm, "custom-butchered-lm")
weigh(v)
#> # A tibble: 37 × 2
#>    object            size
#>    <chr>            <dbl>
#>  1 model.effects     5.12
#>  2 model.model.mpg   2.56
#>  3 model.model.cyl   2.56
#>  4 model.model.disp  2.56
#>  5 model.model.hp    2.56
#>  6 model.model.drat  2.56
#>  7 model.model.wt    2.56
#>  8 model.model.qsec  2.56
#>  9 model.model.vs    2.56
#> 10 model.model.am    2.56
#> # ℹ 27 more rows
Created on 2023-11-30 with reprex v2.0.2
Should we consider a different style of butchering that prioritizes simple predictions only and throws out big components like the ones used for confidence/prediction intervals?
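
To make the question concrete, here is a rough base-R sketch of what a "predictions only" axe might do for an `lm` fit. The name `axe_for_size()` is hypothetical and not part of butcher, and the set of components it drops (including the stored model frame) is an assumption about what point predictions on new data need, not a tested guarantee for every kind of `lm` fit.

```r
# Hypothetical helper, NOT part of butcher: strip the large per-observation
# components of an lm fit, keeping only what point predictions on new data use.
axe_for_size <- function(x) {
  stopifnot(inherits(x, "lm"))
  x$residuals <- numeric(0)      # only needed for se.fit / intervals
  x$fitted.values <- numeric(0)  # predictions on the training data
  x$effects <- numeric(0)
  x$model <- NULL                # stored model frame (assumed unneeded with newdata)
  x$qr$qr <- matrix(0)           # keep the qr object itself so the pivot survives
  x
}

fit <- lm(mpg ~ wt + hp, data = mtcars)
small <- axe_for_size(fit)
new_cars <- mtcars[1:3, c("wt", "hp")]

# Point predictions should be unchanged, and the object should be smaller
all.equal(predict(small, new_cars), predict(fit, new_cars))
object.size(small) < object.size(fit)
```

The trade-off is that anything relying on the removed pieces, such as `predict(..., se.fit = TRUE)`, `interval = "confidence"`, `summary()`, or `residuals()`, should be expected to fail or return meaningless results on the axed object.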