
Mislabelled output for `BayesFactor::regressionBF()` models?

Open profandyfield opened this issue 8 months ago • 16 comments

With regressionBF(), if I inspect the model directly I get:

require(discovr)
require(BayesFactor)

album_tib <- discovr::album_sales
album_bf <- BayesFactor::regressionBF(sales ~ adverts + airplay + image, rscaleCont = "medium", data = album_tib)
album_bf


  |==================================================================================================| 100%
Bayes factor analysis
--------------
[1] adverts                   : 1.320123e+16 ±0%
[2] airplay                   : 4.723817e+17 ±0.01%
[3] image                     : 6039.289     ±0%
[4] adverts + airplay         : 5.65038e+39  ±0%
[5] adverts + image           : 2.65494e+20  ±0%
[6] airplay + image           : 1.034464e+20 ±0%
[7] adverts + airplay + image : 7.746101e+42 ±0%

Against denominator:
  Intercept only 
---
Bayes factor type: BFlinearModel, JZS

but with model_parameters(album_bf) I get:

model_parameters(album_bf)


Multiple `BFBayesFactor` models detected - posteriors are extracted from the first numerator
  model.
  See help("get_parameters", package = "insight").
# Extra Parameters 

Parameter |  Median |             95% CI |   pd |       BF
----------------------------------------------------------
mu        |  193.03 | [ 183.63,  202.51] | 100% | 1.32e+16
adverts   |    0.09 | [   0.08,    0.11] | 100% | 4.72e+17
sig2      | 4389.28 | [3629.69, 5366.11] | 100% | 6.04e+03
g         |    0.42 | [   0.08,   12.32] | 100% | 5.65e+39

# Fixed Effects 

Parameter |       BF
--------------------
adverts   | 2.65e+20
airplay   | 1.03e+20
image     | 7.75e+42

The values in the BF column map onto the output of album_bf, but the labels in the Parameter column do not. Am I misunderstanding the labels, or is model_parameters() mis-labelling? [For the record, I'm using model_parameters() to get nice output and because I want students to learn a consistent workflow with all models.]

profandyfield avatar Apr 17 '25 15:04 profandyfield

What about the bayestestR functions?

strengejacke avatar Apr 17 '25 19:04 strengejacke

You mean, what about using them instead? I don't have space to properly talk about priors, so I need something with defaults comparable to BayesFactor (which I'm using as a gateway drug for the reader!). I don't have space to get into stuff like brms or Stan. If you can point me to something that shows how to mimic the BayesFactor functions using bayestestR then I'll take a look. I couldn't find anything obvious on the bayestestR website.

I should also add that @DominiqueMakowski is exerting considerable pressure to do everything with model_parameters() (and I think his point is quite compelling in terms of students having to learn a workflow that they can apply to almost any model).

profandyfield avatar Apr 17 '25 20:04 profandyfield

What about the bayestestR functions?

Haha, I literally said to Andy "use parameters instead of bayestestR for that".

But regardless, same issue:

> bayestestR::describe_posterior(album_bf)
Multiple `BFBayesFactor` models detected - posteriors are extracted from
  the first numerator model.
  See help("get_parameters", package = "insight").
Summary of Posterior Distribution

Parameter       |  Median |             95% CI |   pd |          ROPE
---------------------------------------------------------------------
mu              |  193.14 | [ 183.86,  202.58] | 100% | [-8.07, 8.07]
adverts         |    0.09 | [   0.08,    0.11] | 100% | [-8.07, 8.07]
sig2            | 4389.83 | [3631.90, 5373.08] | 100% | [-8.07, 8.07]
g               |    0.44 | [   0.08,   11.72] | 100% | [-8.07, 8.07]
adverts-adverts |         |                    |      |              
airplay-airplay |         |                    |      |              
image-image     |         |                    |      |              

Parameter       | % in ROPE |       BF | Prior
----------------------------------------------
mu              |        0% | 1.32e+16 |      
adverts         |      100% | 1.32e+16 |      
sig2            |        0% | 1.32e+16 |      
g               |    98.76% | 1.32e+16 |      
adverts-adverts |           |          |      
airplay-airplay |           |          |      
image-image     |           |          |

DominiqueMakowski avatar Apr 17 '25 20:04 DominiqueMakowski

We probably never implemented full support for regressionBF() in insight:

> insight::find_parameters(album_bf)
$conditional
[1] "adverts-adverts" "airplay-airplay" "image-image"    

$extra
[1] "mu"      "adverts" "sig2"    "g"

DominiqueMakowski avatar Apr 17 '25 20:04 DominiqueMakowski

Yes, I think so, too

strengejacke avatar Apr 18 '25 06:04 strengejacke

This is definitely wrong. I'll take a look.

(However, @profandyfield, might I suggest not teaching the {BayesFactor} package for inference on anything more complex than a correlation/contingency table/t-test 📛)

mattansb avatar Apr 18 '25 06:04 mattansb

I think the only context, other than those you list, in which I use it is comparing linear models (as in the example above). For this case, what would you suggest instead? (Bearing in mind the aim is as a gateway to more sophisticated approaches should the user be convinced to find out more about Bayesian methods.)

profandyfield avatar Apr 18 '25 07:04 profandyfield

The bf_*() functions in bayestestR should work, but testing parameters only works for Bayesian models.

strengejacke avatar Apr 18 '25 08:04 strengejacke

Sidetracking the original issue, but I also use {BayesFactor} exclusively for t-tests & correlations, and I find BayesFactor::regressionBF() confusing: you would expect it to be consistent with other "regression" functions (lm, glm) and return parameters of the specified model, but it does something quite different and the output is quite confusing.

I would also just not use BFs for anything other than simple tests, and simply signpost that doing Bayesian regressions requires a bit more thought / a different approach and is outside the scope of the module...

DominiqueMakowski avatar Apr 18 '25 08:04 DominiqueMakowski

I just saw that the BF function fits all combinations of the variables and indeed compares models.

In this case, you could fit the single models and use bf_models().

strengejacke avatar Apr 18 '25 09:04 strengejacke

(and that might be the reason why parameters or insight fail: the dynamic output is something we haven't taken into consideration yet)

strengejacke avatar Apr 18 '25 09:04 strengejacke
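
In the meantime, one possible stopgap (a sketch only, not verified against this particular bug) is to subset the BFBayesFactor object to a single numerator model before extracting anything, using BayesFactor's standard `[` indexing, so there is no ambiguity about which posterior gets summarised:

require(BayesFactor)
require(parameters)

# pull out just the full model ("adverts + airplay + image" vs the intercept-only denominator)
full_model <- album_bf[7]
full_model

# with a single numerator there is only one posterior to extract
model_parameters(full_model)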

album_tib <- discovr::album_sales

lm0 <- lm(sales ~ 1, data = album_tib)
lm1 <- lm(sales ~ adverts, data = album_tib)
lm2 <- lm(sales ~ airplay, data = album_tib)
lm3 <- lm(sales ~ image, data = album_tib)
lm4 <- lm(sales ~ adverts + airplay, data = album_tib)
lm5 <- lm(sales ~ adverts + image, data = album_tib)
lm6 <- lm(sales ~ airplay + image, data = album_tib)
lm7 <- lm(sales ~ adverts + airplay + image, data = album_tib)

bayestestR::bf_models(lm0, lm1, lm2, lm3, lm4, lm5, lm6, lm7, denominator = 1)
#> Bayes Factors for Model Comparison
#> 
#>       Model                           BF
#> [lm1] adverts                   3.50e+16
#> [lm2] airplay                   1.39e+18
#> [lm3] image                     5.40e+03
#> [lm4] adverts + airplay         6.24e+40
#> [lm5] adverts + image           7.11e+20
#> [lm6] airplay + image           2.67e+20
#> [lm7] adverts + airplay + image 1.00e+44
#> 
#> * Against Denominator: [lm0] (Intercept only)
#> *   Bayes Factor Type: BIC approximation

Created on 2025-04-18 with reprex v2.1.1

strengejacke avatar Apr 18 '25 09:04 strengejacke

@profandyfield For a soft entry into Bayesian estimation, I would avoid {BayesFactor} because the parameterization there is non-standard, and it has very little support (plotting, emmeans/marginaleffects, etc...) - instead I would teach {rstanarm} (though I typically skip straight to {brms}). For Bayesian model comparisons, I would also avoid {BayesFactor} because it is limited to linear models only (and also there is much debate regarding the validity of "default priors" for testing in complex models). A very soft entry would be the BIC approximations, as demonstrated above by @strengejacke - it is easy to do if you already know how to fit models, and it is widely applicable. For BFs for specific hypotheses, I would also switch to {rstanarm}/{brms} + {bridgesampling} (which we wrap in bayestestR with bayesfactor_models()).
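
For concreteness, a minimal sketch of that {rstanarm} + {bridgesampling} route via bayestestR::bayesfactor_models(); the priors and settings are just the rstanarm defaults and the model names are illustrative. The diagnostic_file argument is needed so the marginal likelihoods can be recomputed by {bridgesampling}:

library(rstanarm)
library(bayestestR)

album_tib <- discovr::album_sales

# diagnostic_file is required for the bridge-sampling step
m0 <- stan_glm(sales ~ 1, data = album_tib,
               diagnostic_file = file.path(tempdir(), "m0.csv"))
m1 <- stan_glm(sales ~ adverts + airplay + image, data = album_tib,
               diagnostic_file = file.path(tempdir(), "m1.csv"))

# BF for the full model against the intercept-only model
bayesfactor_models(m1, denominator = m0)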


Regarding your issue, parameters::model_parameters() and bayestestR::describe_posterior() both have a BF column, but it is not the same - in bayestestR::describe_posterior() it is repeating the BF of the selected (first) model compared to the null, while in parameters::model_parameters() it is the BF table.

I think we discussed this elsewhere, but I think in both cases the BF column is inappropriate as it implies these are parameter-specific Bayes factors (such as those given by bayesfactor_parameters()), which they are not.
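
For illustration, parameter-specific Bayes factors of that kind could be obtained along these lines (reusing the hypothetical m1 fit from the sketch above; when no prior model is supplied, bayestestR samples one for supported models):

# hypothetical: parameter-wise (Savage-Dickey) BFs against a point null of 0
bayestestR::bayesfactor_parameters(m1, null = 0)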

We should be returning information for all the parameters provided by BayesFactor::posterior(), other than g or those starting with g_.
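
As a rough sketch of what that means in practice (this is not how insight implements it), one can draw from the posterior of a single numerator model and keep every column except g and the g_ terms:

require(BayesFactor)

post  <- BayesFactor::posterior(album_bf, index = 7, iterations = 4000)
draws <- as.data.frame(as.matrix(post))

# drop "g" and anything starting with "g_"
keep <- !grepl("^g($|_)", names(draws))
summary(draws[, keep])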

mattansb avatar Apr 18 '25 18:04 mattansb

Thanks for the useful comments @mattansb. Given my time constraints, I will use the BIC approximations in the first instance but have made a note-to-self to look at rstanarm (which I haven't used but looks interesting) in more detail and come back to it later in the writing cycle if there's time.

profandyfield avatar Apr 18 '25 22:04 profandyfield

@profandyfield I used the Regression and Other Stories textbook when I taught intro regression. It uses rstanarm for all of its code examples and I think it's a great text for folks new to modeling in general or to Bayesian modeling in particular.

bwiernik avatar Apr 24 '25 21:04 bwiernik

That's a convenient recommendation @bwiernik 😀

profandyfield avatar Apr 25 '25 06:04 profandyfield