brms icon indicating copy to clipboard operation
brms copied to clipboard

Equality constraints of parameters

Open tfjaeger opened this issue 4 years ago • 10 comments

Hi Paul, following up on this thread about allowing parameters to be constrained to certain values via prior specification: https://github.com/paul-buerkner/brms/issues/783

You pointed out that equality constraints on parameters could be implemented in similar ways. Perhaps one could even allow equality constraints on parameters that include simple functional relations (that are easy to translate into stan)? Either way adding equality constraints via prior specification or otherwise would be a welcome feature!

tfjaeger avatar Apr 30 '20 15:04 tfjaeger

Hey, is there any update on this topic? If not, is there any workaround that can be used provisionally until this functionality is officially provided? Thanks a lot!

Niklas191 avatar Feb 06 '21 18:02 Niklas191

No update. I have a clear idea of how the user-interface via the prior argument should look like but the internal implementation is much trickier because how how prior is accessed and interpreted internally. I will have to figure out the full scope of this feature first before I can think about the best implementation.

paul-buerkner avatar Feb 07 '21 13:02 paul-buerkner

Thanks Paul!

Just in case someone else is looking for a temporary solution: A nice workaround using a non-linear formula for specifying the equality constraints is described here.

Niklas191 avatar Feb 07 '21 15:02 Niklas191

Just to state why I opened this issue originally. The approach @Niklas191 linked to can handle equality of parameters but I don't see how it can handle equality constraints like b_k = 3 * b_j.

tfjaeger avatar Feb 07 '21 17:02 tfjaeger

I think you can actually by relating non-linear parameters to each other. e.g. nlf(bk ~ 3 * bj)

paul-buerkner avatar Feb 07 '21 18:02 paul-buerkner

I was wondering whether that works, and how general it would be. So, say I have a model like the following:

bf(respond_t ~ log(inv_logit(bvot) + ((1-inv_logit(bvot)) * inv_logit(bcontext)))
                        - log((1-inv_logit(bvot)) + (inv_logit(bvot)) * (1-inv_logit(bcontext))), 
                        bvot ~ 1 + vot.1 + vot.2,
                        bcontext ~ 0 + bias.numeric,
                        nl = TRUE)

and I'd like to express that bvot = bcontext * -2, all I'd have to do is to add + nlf(bvot ~ bcontext * -2) after the call to bf()? I didn't think this would be possible. I don't see any examples in the helpfile that use nlf() to relate two parameters that are part of another formula to each other. I only see examples that define an nlf() the elements of which are then described by other formulas.

I guess in any case, that would still leave inequality constraints like bvot > bcontext * -2 (i.e., lower/upper bounds that refer to other parameters). Note sure whether you were planning to allow those? Should I open that as a separate issue?

tfjaeger avatar Feb 07 '21 21:02 tfjaeger

Yes this is the right way to go for equality constraints right now. I dont have any plans for unequality constraints at the moment.

tfjaeger [email protected] schrieb am So., 7. Feb. 2021, 22:25:

I was wondering whether that works, and how general it would be. So, say I have a model like the following:

bf(respond_t ~ log(inv_logit(bvot) + ((1-inv_logit(bvot)) * inv_logit(bcontext))) - log((1-inv_logit(bvot)) + (inv_logit(bvot)) * (1-inv_logit(bcontext))), bvot ~ 1 + vot.1 + vot.2, bcontext ~ 0 + bias.numeric, nl = TRUE)

and I'd like to express that bvot = bcontext * -2, all I'd have to do is to add + nlf(bvot ~ bcontext * -2) after the call to bf()? I didn't think this would be possible. I don't see any examples in the helpfile that use nlf() to relate two parameters that are part of another formula to each other. I only see examples that define an nlf() the elements of which are then described by other formulas.

I guess in any case, that would still leave inequality constraints like bvot > bcontext * -2 (i.e., lower/upper bounds that refer to other parameters). Note sure whether you were planning to allow those? Should I open that as a separate issue?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/issues/888#issuecomment-774770641, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADCW2AFYB45QP5V667T7OWLS54ALPANCNFSM4MVV43FQ .

paul-buerkner avatar Feb 08 '21 08:02 paul-buerkner

Hi all, I thought I should share an alternative approach that can be implemented via prior specification.

The key idea comes from observing that brms will insert whatever it finds inside a constant()prior into the Stan code, so this can be used to pass parameter names (or any other expression more generally).

Because we can avoid using the non-linear syntax, resulting formulas are more readable. The downside is that we need to look into the Stan code to find out how brms names parameters internally, but I find this to be a worthwhile tradeoff.

Here's an example for setting a constraint between regression coefficients:

# Setup
library(brms)

# Model formula
mtform <- bf(mpg ~ wt + gear + qsec)

# Examine the model code
make_stancode(mtform, mtcars)
#> parameters {
#>   vector[Kc] b;  // population-level effects
#>   real Intercept;  // temporary intercept for centered predictors
#>   real<lower=0> sigma;  // dispersion parameter
#> }
#> transformed parameters {
#>   real lprior = 0;  // prior contributions to the log posterior
#>   lprior += student_t_lpdf(Intercept | 3, 19.2, 5.4);
#>   lprior += student_t_lpdf(sigma | 3, 0, 5.4)
#>     - 1 * student_t_lccdf(0 | 3, 0, 5.4);
#> }
#> model {
#>   // likelihood including constants
#>   if (!prior_only) {
#>     target += normal_id_glm_lpdf(Y | Xc, Intercept, b, sigma);
#>   }
#>   // priors including constants
#>   target += lprior;
#> }

The code above tells us the coefficients are stored in the b vector. Let's say we want to make the coefficient for qsec be twice that of gear. We might first try this:

# Use "constant" prior to insert parameter name
mtprior <- get_prior(formula = mtform, data = mtcars)
mtprior[mtprior$coef == "qsec", "prior"] <- "constant(2*b[2])"
mtprior
#>                    prior     class coef group resp dpar nlpar lb ub
#>                   (flat)         b                                 
#>                   (flat)         b gear                            
#>         constant(2*b[2])         b qsec                            
#>                   (flat)         b   wt                            
#>  student_t(3, 19.2, 5.4) Intercept                                 
#>     student_t(3, 0, 5.4)     sigma                             0   

Notice that brms shows the priors table with variables sorted alphabetically, but the actual parameters are assigned in the order in which the terms appear in the formula, which is why I use b[2] for gear's coefficient.

We verify how this prior modifies the resulting code:

make_stancode(mtform, mtcars, prior = mtprior)
#> parameters {
#>   real par_b_1;
#>   real par_b_2;
#>   real Intercept;  // temporary intercept for centered predictors
#>   real<lower=0> sigma;  // dispersion parameter
#> }
#> transformed parameters {
#>   vector[Kc] b;  // population-level effects
#>   real lprior = 0;  // prior contributions to the log posterior
#>   b[3] = b[2];
#>   b[1] = par_b_1;
#>   b[2] = par_b_2;
#>   lprior += student_t_lpdf(Intercept | 3, 19.2, 5.4);
#>   lprior += student_t_lpdf(sigma | 3, 0, 5.4)
#>     - 1 * student_t_lccdf(0 | 3, 0, 5.4);
#> }

A little problem here is that the structure has changed. As far as brms is concerned, b is now a mix of constants and parameters. However, "constant" priors are placed before other assignments, so this will set b[3] = b[2] before b[2] is defined and will result in an error if we try to run it:

brm(mtform, mtcars, prior = mtprior)
#> Chain 1: Rejecting initial value:
#> Chain 1:   Error evaluating the log probability at the initial value.
#> Chain 1: Exception: normal_id_glm_lpdf: Weight vector[3] is nan, but must be finite!
#> error occurred during calling the sampler; sampling not done

But now that we know how our prior changes the resulting code, we can refer to our desired parameter by its new name:

mtprior[mtprior$coef == "qsec", "prior"] <- "constant(2*par_b_2)"

# Done!
brm(mtform, mtcars, prior = mtprior)
#> Population-Level Effects: 
#>           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept    15.36      6.37     3.03    27.58 1.00     3648     2893
#> wt           -4.79      0.51    -5.80    -3.76 1.00     4023     2947
#> gear          0.51      0.14     0.24     0.79 1.00     3777     3062
#> qsec          1.02      0.29     0.47     1.59 1.00     3777     3062
#> 
#> Family Specific Parameters: 
#>       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma     2.66      0.37     2.06     3.48 1.00     3416     2342
#> 
#> Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).

(sorry for bumping an old issue but this is currently the top Google result for "brms equality constraint" so it might be helpful to someone)

bmfazio avatar Apr 28 '23 09:04 bmfazio

Thank you @bmfazio . I guess a similar approach would lend itself to implementing inequality constraints, just that one would have to edit the stancode directly, right? E.g., for b[3] > b[2] * -2

#> parameters { #> real par_b_1; #> real par_b_2; #> real par_b_3<lower=0>; #> real Intercept; // temporary intercept for centered predictors #> real<lower=0> sigma; // dispersion parameter #> } #> transformed parameters { #> vector[Kc] b; // population-level effects #> real lprior = 0; // prior contributions to the log posterior #> b[1] = par_b_1; #> b[2] = par_b_2; #> b[3] = -2 * par_b_2 + par_b_3; #> lprior += student_t_lpdf(Intercept | 3, 19.2, 5.4); #> lprior += student_t_lpdf(sigma | 3, 0, 5.4) #> - 1 * student_t_lccdf(0 | 3, 0, 5.4); #> }

tfjaeger avatar May 07 '23 14:05 tfjaeger

That seems reasonable. Of course, introducing more complex relations between parameters means extra care will be needed to make sure our priors still have sensible implications.

AFAIK brms doesn't have a way to directly edit model code, but here is one way to do it:

library(brms)
# Model formula
mtform <- bf(mpg ~ wt + gear + qsec)
# Set prior
mtprior <- get_prior(formula = mtform, data = mtcars)
mtprior[mtprior$coef == "qsec", "prior"] <- "constant(2*par_b_2)"
# Fit model
fit1 <- brm(mtform, mtcars, prior = mtprior)
# Edit code
new_code <- capture.output(stancode(fit1))
new_code <- paste(c(new_code[1:22],
                    "  real<lower=0> par_b_3;",
                    new_code[23:28], new_code[30:31],
                    "  b[3] = par_b_2 - par_b_3;",
                    new_code[-c(1:31)]),
                  collapse = "\n")
# Replace the model object
fit1$model <- structure(new_code, class = c("character", "brmsmodel"))
fit1$fit@stanmodel <- rstan::stan_model(model_code = new_code)
# Fit with modified model
fit2 <- update(fit1, recompile = FALSE)
# Verify effect on posterior
bayesplot::mcmc_pairs(fit2)

image

P.S. In case backend = "cmdstanr" was used, a slight tweak to the model replacement line is needed:

attributes(fit1$fit)$CmdStanModel <- cmdstan_model(write_stan_file(new_code))

bmfazio avatar May 09 '23 12:05 bmfazio