posteriordb Store the actual posterior dimension

One measure of how difficult a posterior is, is its number of dimensions. For example, posterior_database/posteriors/GLM_Poisson_Data-GLM_Poisson_model.json reports "dimensions":

{
  "keywords": ["bpa book", "Poisson model"],
  "urls": "https://github.com/stan-dev/example-models/tree/master/BPA/Ch.03",
  "references": "kery2011population",
  "dimensions": {
    "alpha": 1,
    "beta1": 1,
    "beta2": 1,
    "beta3": 1,
    "log_lambda": 40,
    "lambda": 40
  },
  "reference_posterior_name": null,
  "added_date": "2021-07-01",
  "added_by": "Kane Lindsay",
  "name": "GLM_Poisson_Data-GLM_Poisson_model",
  "model_name": "GLM_Poisson_model",
  "data_name": "GLM_Poisson_Data"
}

But looking at the code, these "dimensions" include transformed parameters and generated quantities, which have many dimensions but do not influence how difficult the posterior is:

parameters {
  real<lower=-20, upper=20> alpha;
  real<lower=-10, upper=10> beta1;
  real<lower=-10, upper=10> beta2;
  real<lower=-10, upper=10> beta3;
}
transformed parameters {
  vector[n] log_lambda;
  
  log_lambda = alpha + beta1 * year + beta2 * year_squared
               + beta3 * year_cubed;
}
generated quantities {
  vector[n] lambda;
  
  lambda = exp(log_lambda);
}

It would be good to report the actual posterior dimensionality.
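As a rough illustration of the gap, here is a minimal Python sketch that sums the reported "dimensions" with and without the transformed parameters and generated quantities. The list of parameter-block names is copied by hand from the Stan code above, and `count_dimensions` is a hypothetical helper, not part of posteriordb.

```python
# "dimensions" as reported in GLM_Poisson_Data-GLM_Poisson_model.json
reported_dimensions = {
    "alpha": 1,
    "beta1": 1,
    "beta2": 1,
    "beta3": 1,
    "log_lambda": 40,
    "lambda": 40,
}

# Names declared in the Stan `parameters` block (assumption: copied by hand
# from the model code; a real implementation would have to extract them).
parameter_block_names = ["alpha", "beta1", "beta2", "beta3"]


def count_dimensions(dimensions, names=None):
    """Sum the entries of `dimensions`, optionally restricted to `names`."""
    if names is None:
        names = list(dimensions)
    return sum(dimensions[name] for name in names)


reported_total = count_dimensions(reported_dimensions)
sampled_total = count_dimensions(reported_dimensions, parameter_block_names)
print(reported_total, sampled_total)  # 84 4
```

For this posterior the reported total is 84, while only 4 dimensions are actually sampled.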

Jun 03 '25 19:06 avehtari

I agree. How should we do this in the best way?

Maybe add "posterior_dimensions" as a slot and only include the parameters that are in the parameters block. We would also need to handle the parameter types (simplex, covariance matrix, etc.). However, this would probably not be that difficult.

Something like:

  "posterior_dimensions": {
    "alpha": {"real": 1},
    "beta1": {"real": 1},
    "beta2": {"real": 1},
    "beta3": {"real": 1}
  }

Jun 04 '25 05:06 MansMeg

It could also be just one number matching the dimensionality of the unconstrained space.
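To show how a typed slot and a single number could relate, here is a hedged Python sketch that reduces a per-parameter type slot like the one proposed above to constrained and unconstrained totals. The type names beyond "real", the meaning of the size (vector length, or number of rows for square matrix types), and both helper functions are assumptions for illustration. The size formulas follow the standard Stan transforms: a K-simplex has K-1 free coordinates, a K x K covariance matrix has K(K+1)/2, and a K x K correlation matrix has K(K-1)/2.

```python
def constrained_size(kind, k):
    """Number of values stored for one parameter on the constrained scale."""
    if kind in ("real", "simplex"):
        return k
    if kind in ("cov_matrix", "corr_matrix"):
        return k * k
    raise ValueError(f"unknown parameter type: {kind}")


def unconstrained_size(kind, k):
    """Number of free coordinates for one parameter on the unconstrained scale."""
    if kind == "real":
        return k  # bounds change the transform, not the count
    if kind == "simplex":
        return k - 1  # the sum-to-one constraint removes one degree of freedom
    if kind == "cov_matrix":
        return k * (k + 1) // 2
    if kind == "corr_matrix":
        return k * (k - 1) // 2
    raise ValueError(f"unknown parameter type: {kind}")


def total_sizes(slot):
    """Return (constrained, unconstrained) totals for a {name: {type: size}} slot."""
    constrained = unconstrained = 0
    for spec in slot.values():
        for kind, k in spec.items():
            constrained += constrained_size(kind, k)
            unconstrained += unconstrained_size(kind, k)
    return constrained, unconstrained


# The slot proposed for the GLM Poisson posterior.
glm_slot = {
    "alpha": {"real": 1},
    "beta1": {"real": 1},
    "beta2": {"real": 1},
    "beta3": {"real": 1},
}

# A hypothetical slot with constrained types, for illustration only.
example_slot = {
    "mu": {"real": 1},
    "theta": {"simplex": 3},
    "Sigma": {"cov_matrix": 2},
}

print(total_sizes(glm_slot))      # (4, 4)
print(total_sizes(example_slot))  # (8, 6)
```

For the GLM Poisson posterior both totals are 4. They differ only once constrained types such as simplexes or covariance matrices appear.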

Jun 04 '25 06:06 avehtari

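In R we can first create a fit object and get some valid values for the parameters:

library(posteriordb)  # posterior(), stan_code_file_path(), pdb_data()
library(cmdstanr)     # cmdstan_model()
library(posterior)    # subset_draws()

# `pdb` is a posterior database connection created beforehand (not shown here)
po <- posterior("GLM_Poisson_Data-GLM_Poisson_model", pdb)
mod <- cmdstan_model(stan_file = stan_code_file_path(po))
# a single warmup and a single sampling iteration are enough to obtain valid parameter values
fit <- mod$sample(data=pdb_data(po), init=0.01, iter_warmup=1, iter_sampling=1, chains=1, refresh=0, show_messages=FALSE, show_exceptions=FALSE, diagnostics=NULL, sig_figs = 12)
# keep only variables declared in the parameters block
pars <- names(fit$variable_skeleton(transformed_parameters = FALSE, generated_quantities = FALSE))
drs <- fit$draws()
vars <- sapply(pars, \(par) as.numeric(subset_draws(drs, variable=par)), simplify=FALSE, USE.NAMES=TRUE)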

and then get the dimensions of constrained space

length(unlist(vars))

and the dimensions of unconstrained space

length(fit$unconstrain_variables(vars))

Jun 04 '25 07:06 avehtari
