bridgesampling
Bridgesampling Error
When I run bridgesampling on a Stan fit, I get the following error:
Iteration: 1
Error in out[!from@positive] <- -out[!from@positive] :
NAs are not allowed in subscripted assignments
In addition: Warning messages:
1: 16000 of the 16000 log_prob() evaluations on the posterior draws produced -Inf/Inf.
2: 16000 of the 16000 log_prob() evaluations on the proposal draws produced -Inf/Inf.
In my fit object there is one variable that has NA values, but it is not one of interest to me. Will this prevent bridgesampling from running, requiring me to re-run the Stan fit? Or are there any workarounds that would let bridgesampling still run?
Thank you LS
Hey, I'm facing the same issue... However I don't find any NAs in my fit object. Has there ever been a solution?
Okay, I don't know whether this will help you @lsolomyak (also considering you asked over a year ago), but in my case the problem was related to this bug in Stan: https://github.com/stan-dev/rstan/issues/649
In short, when you write samples to .csv files and tell Stan not to save the warmup samples, it will still write n_samples + n_warmup rows in the sample files.
The result is that the first n_samples rows are filled with your samples and the trailing n_warmup rows are just filled with a bunch of zeros.
I have written a function that takes the stanfit object and cuts away the superfluous rows. You can find it here: https://gist.github.com/ktrutmann/e8c5b47f8467ac7f60e56412ab899cdd
Hope this helps anyone else running into this issue.
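The core of such a cleanup can be sketched as follows — a simplified illustration that operates on a plain iterations-by-parameters draws matrix rather than the full stanfit object (the function name here is made up for illustration):

```r
# Drop the trailing all-zero rows that the csv bug described above
# leaves at the bottom of the draws matrix.
drop_trailing_zero_rows <- function(draws) {
  all_zero <- apply(draws, 1, function(row) all(row == 0))
  last_real <- max(which(!all_zero))  # last row containing an actual draw
  draws[seq_len(last_real), , drop = FALSE]
}
```

This only strips the zero padding at the bottom; interior rows are never touched.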
Hi there,
I'm also seeing this error, but possibly for a different reason. I'm doing a simulation study in which I repeatedly estimate Bayes factors for a given dataset and model (i.e., I fit the brms model repeatedly to the same data). On most simulation runs this works like a charm, but in a few runs I get the above error. I don't save the samples but immediately call bridge_sampling() in the same session.
Reproducible example
The model was fit using cmdstanr, so this requires a recent version of brms. I use:
package * version date lib source
brms * 2.16.1 2021-08-23 [1] CRAN (R 4.1.1)
(If I remember correctly, I also sometimes got this error using the rstan backend, but I'm not entirely sure at this point.)
Download the stanfit-object here.
library("brms")
mod <- readRDS("brms_samples.rds")
bridge_sampler(mod)
Error in out[!from@positive] <- -out[!from@positive] :
NAs are not allowed in subscripted assignments
In addition: Warning messages:
1: effective sample size cannot be calculated, has been replaced by number of samples.
2: 1 of the 60000 log_prob() evaluations on the posterior draws produced -Inf/Inf.
3: 100 of the 60000 log_prob() evaluations on the proposal draws produced -Inf/Inf.
Error: Bridgesampling failed. Perhaps you did not set 'save_pars = save_pars(all = TRUE)' when fitting your model?
Debugging attempt
I've done some digging, but got stuck when I hit rstan code. I'll outline my attempts to trace this back to its origin below. At this point, I'm not even sure this is a bridgesampling issue. If you can confirm that it's either an rstan or a brms issue, I'll be more than happy to raise this at the respective repository.
I have traced the error to a call of bridge_sampler.stanfit(). Specifically, it seems to originate in as.numeric.brob(). It appears that the following lines are the source:
https://github.com/quentingronau/bridgesampling/blob/1b78f76b0536e61a936d24adf9323c797482673e/R/bridge_sampler_internals.R#L270-L271
Further digging shows that this line produces one NA:
https://github.com/quentingronau/bridgesampling/blob/1b78f76b0536e61a936d24adf9323c797482673e/R/bridge_sampler_internals.R#L234
which in turn yields an NA here as well:
https://github.com/quentingronau/bridgesampling/blob/1b78f76b0536e61a936d24adf9323c797482673e/R/bridge_sampler_internals.R#L249
The NA is caused by taking the difference of two -Inf values.
> q11[which(is.na(l1))]
[1] -Inf
> q12[which(is.na(l1))]
[1] -Inf
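The arithmetic behind the NA is easy to reproduce in isolation: subtracting two log densities that are both -Inf gives NaN, which R reports as NA in many contexts:

```r
q11 <- -Inf       # log posterior density at the problematic draw
q12 <- -Inf       # log proposal density at the same draw
q11 - q12         # NaN: -Inf - (-Inf) is undefined
is.na(q11 - q12)  # TRUE, since is.na() is also TRUE for NaN
```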
I suppose it's expected to see -Inf in q11, but, if I understand correctly, q12 originates directly from this call:
https://github.com/quentingronau/bridgesampling/blob/1b78f76b0536e61a936d24adf9323c797482673e/R/bridge_sampler_normal.R#L40
Now, it seems samples_4_iter is constructed from the unconstrained samples.
https://github.com/quentingronau/bridgesampling/blob/1b78f76b0536e61a936d24adf9323c797482673e/R/bridge_sampler.R#L221
Comparing the original to the unconstrained samples, I think, shows the problem: Unconstraining the samples produces an infinite value.
> which(is.infinite(ex), arr.ind = T)
iterations chains parameters
> which(is.infinite(upars), arr.ind = T)
iterations chains
[1,] 131 14398 3
> upars[which(is.infinite(upars), arr.ind = T)]
[1] Inf
Inf yields a density for the multivariate normal of 0, i.e. a log density of -Inf. As an aside, I think, it is also this infinite value that causes the calculation of the effective sample size to fail.
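That step is easy to verify in one dimension: any infinite coordinate gives a normal density of zero, and hence a log density of -Inf:

```r
dnorm(Inf)              # density at Inf is 0
dnorm(Inf, log = TRUE)  # log density is -Inf
```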
I'm not sure I understand exactly what happens here, but I think all this happens in rstan::unconstrain_pars(), because from what I can tell the generation of skeleton, as well as .rstan_relist(), work as intended, whereas the unconstrained samples appear to contain only about a third of the parameters. Is this expected behavior?
> str(samples@par_dims[samples@sim$pars_oi])
List of 12
$ b_Intercept: int(0)
$ b : int 1
$ sd_1 : int 2
$ cor_1 : int 1
$ Intercept : int(0)
$ r_1 : int [1:2] 63 2
$ r_1_1 : int 63
$ r_1_2 : int 63
$ lp__ : int(0)
$ z_1 : int [1:2] 2 63
$ L_1 : int [1:2] 2 2
$ Cor_1 : int [1:2] 2 2
> dim(samples)[3]
[1] 393
> str(lapply(skeleton, dim))
List of 12
$ b_Intercept: NULL
$ b : int 1
$ sd_1 : int 2
$ cor_1 : int 1
$ Intercept : NULL
$ r_1 : int [1:2] 63 2
$ r_1_1 : int 63
$ r_1_2 : int 63
$ lp__ : NULL
$ z_1 : int [1:2] 2 63
$ L_1 : int [1:2] 2 2
$ Cor_1 : int [1:2] 2 2
> sum(sapply(lapply(skeleton, dim), prod))
[1] 393
> str(lapply(.rstan_relist(ex[1,1, ], skeleton), dim))
List of 12
$ b_Intercept: NULL
$ b : int 1
$ sd_1 : int 2
$ cor_1 : int 1
$ Intercept : NULL
$ r_1 : int [1:2] 63 2
$ r_1_1 : int 63
$ r_1_2 : int 63
$ lp__ : NULL
$ z_1 : int [1:2] 2 63
$ L_1 : int [1:2] 2 2
$ Cor_1 : int [1:2] 2 2
> sum(sapply(lapply(.rstan_relist(ex[1,1, ], skeleton), dim), prod))
[1] 393
> str(ex)
num [1:20000, 1:6, 1:393] -1.82 -2.13 -1.99 -2.26 -1.83 ...
- attr(*, "dimnames")=List of 3
..$ iterations: NULL
..$ chains : chr [1:6] "chain:1" "chain:2" "chain:3" "chain:4" ...
..$ parameters: chr [1:393] "b_Intercept" "b_drm_lure" "sd_id__Intercept" "sd_id__drm_lure" ...
> str(upars)
num [1:131, 1:20000, 1:6] -0.232 -1.94 -0.51 -1.531 -0.787 ...
- attr(*, "dimnames")=List of 3
..$ : NULL
..$ iterations: NULL
..$ chains : chr [1:6] "chain:1" "chain:2" "chain:3" "chain:4" ...
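For readers unfamiliar with the skeleton mechanism: it maps a flat parameter vector back into named arrays of the right shapes, much like base R's utils::relist(). A self-contained illustration with made-up parameter names and values (this is not the actual .rstan_relist() code):

```r
# A skeleton records names and dims; relist() folds a flat vector back into it.
skeleton <- list(b = numeric(1), sd_1 = numeric(2), L_1 = matrix(0, 2, 2))
flat <- c(0.5, 1.1, 0.9, 1, 0.7, 0, 0.714)  # 1 + 2 + 4 = 7 values
params <- utils::relist(flat, skeleton)
str(params)  # b (length 1), sd_1 (length 2), L_1 (2 x 2, filled column-major)
```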
I found some time to do a little more digging. I now understand that rstan::unconstrain_pars() returns only model parameters (no transformed parameters, etc.). I think this accounts for the reduced number of parameters compared to the original samples (131 vs. 393). So I tried to find out which of the unconstrained parameters causes the problem. Because the object returned by rstan::unconstrain_pars() does not give parameter names, I manually unconstrained the parameters and, I think, found the culprit to be the correlation between the random intercept and slope (the link to download the fit is in my previous post):
library("rstan")
library("brms")
library("bridgesampling")
brms_samples <- readRDS("brms_samples.rds")
# brms internal code
samples <- brms:::restructure(brms_samples)
samples <- brms:::update_misc_env(samples)
samples <- samples$fit
stanfit_model <- samples
# bridgesampling internal code
ex <- rstan::extract(samples, permuted = FALSE)
skeleton <- bridgesampling:::.create_skeleton(samples@sim$pars_oi,
samples@par_dims[samples@sim$pars_oi])
upars <- apply(ex, 1:2, FUN = function(theta) {
rstan::unconstrain_pars(stanfit_model, bridgesampling:::.rstan_relist(theta, skeleton))
})
ex2 <- rstan::extract(samples, c("b", "Intercept", "sd_1", "z_1", "L_1"), permuted = FALSE)
> which(is.infinite(upars), arr.ind = T)
iterations chains
[1,] 131 14398 3
> upars[131, 1, 1]
chain:1
0.734979
> atanh(ex2[1, 1, 132, drop = FALSE])
, , parameters = L_1[2,1]
chains
iterations chain:1
[1,] 0.734979
So the Inf is caused by an MCMC sample for the correlation that is exactly 1:
> ex2[14398, 3, 132]
[1] 1
> atanh(ex2[14398, 3, 132])
[1] Inf
I suppose this issue could be hot-fixed in bridgesampling by replacing Infs in unconstrained correlations with something like atanh(0.9999999999999999444439999), but this seems very undesirable. Moreover, if I understand correctly, this appears to be a numerical issue that could similarly arise when a positively constrained parameter is sampled to be exactly 0 (which would yield -Inf in the unconstrained samples). So now I wonder where this issue is best addressed: in brms or in rstan? Maybe @paul-buerkner would know?
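To make the suggested hot-fix concrete: clamping correlations strictly inside (-1, 1) before unconstraining would keep atanh() finite. A sketch only — the function name and the choice of eps are my own, not bridgesampling code:

```r
# Keep correlations strictly inside (-1, 1) so atanh() cannot return +/-Inf.
# eps is chosen ad hoc; too large a value would visibly distort the draws.
clamp_cor <- function(x, eps = 1e-12) pmin(pmax(x, -1 + eps), 1 - eps)
atanh(clamp_cor(1))   # large but finite, instead of Inf
atanh(clamp_cor(-1))  # large negative but finite
```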
I don't know to be honest. Since the main functions are in rstan, I guess it would be better handled there.