brms
feature request: Handling Inf in data
Description of current behavior
When Inf or -Inf are encountered in data, brm passes these rows to Stan, which fails because it is not able to evaluate the log probability (lp) at the initial values. I think the standard troubleshooting for this error is to specify init = ... and/or to use more informative priors. But with infinite values in the data, Stan will fail with this same error regardless of the initial values or priors specified. The failure therefore looks like a fitting issue when in reality the source of the problem is in the data.
reprex:
library(brms)
x <- 0:100
mu <- 10 + 0.3 * x
y <- rnorm(length(mu), mean = mu, sd = 2) # first argument of rnorm() is n, so mu must be passed as mean
dat <- data.frame(x, y)
dat$y[1] <- Inf
mod <- brm(y ~ 1 + x, data = dat) # fails with Stan initialization error
mod2 <- lm(y ~ 1 + x, data = dat) # base R regression gives a (somewhat) informative error in the same circumstance
Desired feature behavior
I think the best approach would be to stop the model fitting with an informative error instead of a warning. Infinite values are likely artifacts of errors made while computing variables (e.g., dividing by 0 or log-transforming 0) and warrant re-examination before fitting any model.
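To make the request concrete, here is a minimal sketch of the kind of check I have in mind, written as a user-side helper. The name check_infinite and the wording of the message are hypothetical; nothing here is existing brms code, and where such a check would live inside brm is left open.

```r
# Hypothetical helper (not part of brms): stop with an informative error
# if any of the listed variables contains Inf or -Inf.
check_infinite <- function(data, vars = names(data)) {
  bad <- list()
  for (v in vars) {
    col <- data[[v]]
    # is.infinite() is FALSE for NA and NaN, so only +/-Inf is flagged here
    if (is.numeric(col) && any(is.infinite(col))) {
      bad[[v]] <- which(is.infinite(col))
    }
  }
  if (length(bad) > 0) {
    details <- vapply(
      names(bad),
      function(v) sprintf("'%s' (rows %s)", v, paste(bad[[v]], collapse = ", ")),
      character(1)
    )
    stop(
      "Infinite values found in ", paste(details, collapse = "; "),
      ". Check how these variables were computed before fitting.",
      call. = FALSE
    )
  }
  invisible(data)
}
```

With the reprex data, calling check_infinite(dat, all.vars(y ~ 1 + x)) before brm would stop and point at row 1 of y, instead of surfacing as a Stan initialization error.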
A softer approach would be to drop rows containing infinite values with a warning on the R side, then pass the cleaned data to Stan for fitting. This mirrors the handling of rows containing NA (absent user-specified imputation with mi()). I don't favor this approach because I'm not able to think of cases when it's still reasonable to fit a model after learning that some of the variable values are infinite.
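For comparison, the softer behavior could look roughly like the sketch below. Again, drop_infinite_rows and its warning text are hypothetical and only illustrate the idea; this is not how brms currently handles data.

```r
# Hypothetical helper (not part of brms): drop rows with Inf or -Inf in any
# of the listed variables and warn about how many rows were removed.
drop_infinite_rows <- function(data, vars = names(data)) {
  has_inf <- rep(FALSE, nrow(data))
  for (v in vars) {
    col <- data[[v]]
    if (is.numeric(col)) {
      has_inf <- has_inf | is.infinite(col)
    }
  }
  if (any(has_inf)) {
    warning(
      "Rows containing infinite values were excluded: ", sum(has_inf),
      call. = FALSE
    )
  }
  # NA rows are left untouched here; they would go through the usual NA handling
  data[!has_inf, , drop = FALSE]
}
```

The cleaned data would then be passed on, e.g. brm(y ~ 1 + x, data = drop_infinite_rows(dat)). The weakness is the one described above: the model still fits, and the warning is easy to overlook.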