rstanarm
rstanarm does the wrong thing with collinear predictors, fails to include the prior when doing some setup
Summary:
A simple model runs fast (as expected) in Stan but runs extremely slowly in rstanarm.
Description:
It's a regression model with collinear predictors and a proper and reasonably strong prior. The posterior is fine and well behaved, but rstanarm mistakenly does some computations with the likelihood (rather than the posterior, or prior-augmented likelihood), causing major problems.
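To illustrate why the posterior is fine even though the likelihood alone is not, here is a minimal base-R sketch. It uses the same x1, x2, and y as the reproduction below. For illustration only, it assumes the residual sd is fixed at 1 (the variable `sigma_fixed` is hypothetical; in the actual model sigma is a parameter) and that the prior is normal(0, 1) on all three coefficients. It says nothing about what rstanarm does internally.

```r
# Same data as in the reproduction below.
x1 <- 1:5
x2 <- 1:5
y  <- c(1, 5, 2, 3, 4)
X  <- cbind(1, x1, x2)            # design matrix with intercept

# Likelihood only: x1 and x2 are identical, so X is rank deficient
# and least squares cannot identify a separate coefficient for x2.
rank_lik <- qr(X)$rank
ols_coef <- coef(lm(y ~ x1 + x2)) # the x2 entry comes back NA

# Prior-augmented: conditional on a fixed residual sd (an assumption
# made only for this sketch), the posterior precision of b is
# X'X / sigma^2 + prior precision, which is positive definite.
sigma_fixed <- 1
prior_prec  <- diag(1, 3)         # normal(0, 1) priors on b[1], b[2], b[3]
post_prec   <- crossprod(X) / sigma_fixed^2 + prior_prec
rank_post   <- qr(post_prec)$rank
post_mean   <- solve(post_prec, crossprod(X, y) / sigma_fixed^2)

print(c(rank_likelihood = rank_lik, rank_posterior = rank_post))
print(ols_coef)
print(drop(post_mean))
```

The design matrix has rank 2, while the prior-augmented precision matrix has full rank 3, so any setup step that works from the likelihood alone hits a singular problem that the posterior does not have.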
Reproducible Steps:
Here's the R code.
library("rstanarm")
library("rstan")
library("cmdstanr")
x1 <- 1:5
x2 <- 1:5
y <- c(1,5,2,3,4)
stan_data <- list(N=length(y), y=y, x1=x1, x2=x2, mu_b=c(0,0,0), sigma_b=c(1,1,1))
model <- cmdstan_model("linear.stan")
fit_1 <- model$sample(data=stan_data, num_chains=4)
fit_1$diagnose()
fit_1$summary()
fake <- data.frame(x1, x2, y)
fit <- stan_glm(y ~ x1 + x2, data=fake, prior_intercept=normal(0, 1, autoscale=FALSE), prior=normal(0, 1, autoscale=FALSE))
And here's the Stan program, linear.stan:
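data {
  int N;
  vector[N] x1;
  vector[N] x2;
  vector[N] y;
  vector[3] mu_b;               // prior means for b
  vector<lower=0>[3] sigma_b;   // prior sds for b
}
parameters {
  vector[3] b;                  // intercept and two slopes
  real<lower=0> sigma;
}
model {
  y ~ normal(b[1] + b[2]*x1 + b[3]*x2, sigma);
  b ~ normal(mu_b, sigma_b);    // proper prior keeps the posterior well behaved
}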
RStanARM Version:
Version 2.19.2
R Version:
R version 3.6.1
Operating System:
macOS Catalina, 10.15.1 (but the problem occurs under other operating systems as well)