SBC icon indicating copy to clipboard operation
SBC copied to clipboard

Issues with generated datasets of ordinal data for a cumulative probit model when running compute_SBC()

Open aftmnelbier11 opened this issue 3 months ago • 5 comments

I am encountering an errors regarding the priors when running compute_SBC() with generated datasets for a cumulative probit model. A bit of info about the data structure: I have an ordinal outcome variable with 7 levels. There are 61 responses per dataset.

It seems the error stems from some generated datasets not containing all response levels. Importantly, I have set the intercept priors individually, so when one level isn't included in the generated dataset, the error is thrown.

Is there a way to specify that all levels are generated in each simulation? Alternatively, would it make to increase the number of generated datasets and kick out those simulations that have this error?

Thanks in advance!

The error (which I get for multiple simulations and sometimes for multiple intercepts):

Simulation 3 resulted in error when fitting.

Error: The following priors do not correspond to any model parameter: 
Intercept_6 ~ normal(1.07, 1)
Function 'get_prior' might be helpful to you.

The code:

## set contrasts (sum-coding)
contrasts(df$group) = contr.sum(2)

##develop formula
bf.form = brms::bf(rating ~ group + cont1 + cont2 + 
                     group:cont1  + group:cont2)
code = "INT"

get_prior(bf.form, data = df, family = cumulative('probit'))

tibble(rating = 1:7) %>% 
  mutate(proportion = 1/7) %>% 
  mutate(cumulative_proportion = cumsum(proportion)) %>% 
  mutate(right_hand_threshold = qnorm(cumulative_proportion))

priors = c(
  prior(normal(-1.07, 1), class = Intercept, coef = 1),
  prior(normal(-0.57, 1), class = Intercept, coef = 2),
  prior(normal(-0.18, 1), class = Intercept, coef = 3),
  prior(normal(0.18, 1), class = Intercept, coef = 4),
  prior(normal(0.57, 1), class = Intercept, coef = 5),
  prior(normal(1.07, 1), class = Intercept, coef = 6),
  prior(normal(0, 1), class = b) 
)

##simulate data
generator <- SBC_generator_brms(bf.form, data = df, 
                                family = cumulative('probit'), init = 0.1, prior = priors,
                                thin = 50, warmup = 10000, refresh = 2000,
                                # Will generate the log density - this is useful, 
                                #but a bit computationally expensive - turned off
                                generate_lp = FALSE)

datasets <- generate_datasets(generator, 100)

backend <- SBC_backend_brms_from_generator(generator, chains = 4, thin = 1,
                                           warmup = 1500, iter = 6000,               
                                           init = 0.1) 

results <- compute_SBC(datasets, backend,
                       cache_mode = "results", 
                       cache_location = file.path(cache_dir, code))

aftmnelbier11 avatar Mar 13 '24 14:03 aftmnelbier11