bayestestR
bayestestR copied to clipboard
rope range for linear models
For logistic regression, the rope range is a function of 3/sqrt(pi), as this is the sd of the logistic distribution (on the latent scale).
However, this is the conditional sd, which is akin to sigma in a linear model. But in a linear model, the rope range if a function of the marginal sd, no?
This seems odd to me. Have I missed something?
mmh I think the rationale we followed back then was quite simple: finding the equivalent of 0.1 * SD, and the most straightforward way was using the d -> log odds conversion formula... but indeed, there might be more appropriate ways of tackling this, what would you recommend?
I thought this was based on Kruschke's book, no?
afaik he only gave recommendation for linear models and from there we derived the logistic one
afaik he only gave recommendation for linear models and from there we derived the logistic one
No, see
(from Supplement https://doi.org/10.1177/2515245918771304)
Reminder for ordinal regression: https://twitter.com/ChelseaParlett/status/1352367383243939841?s=19
https://github.com/easystats/bayestestR/issues/20
It is sigma!
So... "fix" rope_range()
with a startup warning?...
It seems that he also gives the 0.1*sd(y)
as another option?
How very weird - these two options give very different results!
From my empirical experience, I felt like the current default rope range for logistic was "a bit larger" than for lms (i.e., it was a bit harder to be outside the rope given similarish effects). Is the new value usually bigger or smaller than the old?
1. The conditional sd will be smaller than the unconditional sd.
So for the logistic case, the default we use is expected to be rather small - a change of of 0.1*3/sqrt(pi) = 0.18
~ OR: 1.2
is indeed rather small.
2. We have an inconsistency in the default rope range between models:
For linear models we use the unconditional, and for logistic we use conditional:
Unconditional | Conditional | |
---|---|---|
Logistic (any link = "logit" ) |
--none-- | 3/sqrt(pi) |
Gaussian (really any link = "identity" ) |
Sy = sd(insight::get_response(model)) |
sigma(model) |
get_results <- function(n=100, d=0.5){
df <- effectsize::standardize(bayestestR::simulate_difference(n=n, d=d))
m1 <- glm(V0 ~ V1, data=df, family="binomial")
m2 <- lm(V1 ~ V0, data=df)
data.frame(n=c(n, n, n),
d=c(d, d, d),
type=c("LM", "GLM - basic", "GLM - sigma"),
rope=c(bayestestR::rope_range(m2)[2], 0.1 * pi / sqrt(3), 0.1 * insight::get_sigma(m1)),
coef=insight::get_parameters(m2)[2, 2], insight::get_parameters(m1)[2, 2], insight::get_parameters(m1)[2, 2])
}
data <- data.frame()
for(d in seq(0, 1, length.out = 100)){
data <- rbind(data, get_results(d=d))
}
library(ggplot2)
ggplot(data, aes(x=d, y=rope, color=type)) +
geom_point()
Created on 2021-01-26 by the reprex package (v0.3.0)
Indeed, using sigma would result in mostly smaller ROPEs it seems. But I still have troubles with the fact that it depends on the number of 1s and 0s in the dependent variable when the logodds obtained from logistic models are supposed to be unrelated to such properties of the outcome (in their interpretation). I might be missing something out, though
If we have enough evidence to replace the current fixed value by 0.1 * sigma then okay.
Let's first finally submit bayestestR as is, and then for next version we address this + finally the CI range 😬
we can add a warning for the rope_range of logistic models
@DominiqueMakowski not 100% sure what you're simulation was doing, but you're (again 😂) using the OR to d conversation wrong - it's not for predicting group from y in a logistic regression, it's for predicting a dichotomized y from group in a logistic regression.
Also, the warnings shouldn't be for logistic models (as I noted, there is no unconditional alternative for 3/sqrt(pi)
), but for linear models, where we might change from the current unconditional to the conditional.
This is what I'm trying to say:
get_results <- function(n=100, d=0.5){
df <- bayestestR::simulate_difference(n=n, d=d)
m1 <- glm(I(V1>median(V1)) ~ V0, data=df, family="binomial")
m2 <- lm(V1 ~ V0, data=df)
data.frame(n=c(n, n, n),
d=c(d, d, d),
type=c("LM - current (S)", "LM - sigma", "GLM (rescaled)"),
rope=c(bayestestR::rope_range(m2)[2], 0.1 * insight::get_sigma(m2), 0.1), # rescaled, 0.1 * pi / sqrt(3) is just 0.1
coef=c(insight::get_parameters(m2)[2, 2], insight::get_parameters(m2)[2, 2], insight::get_parameters(m1)[2, 2]))
}
library(purrr)
data <- map_dfr(seq(0, 2, length.out = 50), get_results, n = 100)
library(ggplot2)
ggplot(data, aes(x=d, y=rope, color=type)) +
geom_line(size = 2) +
coord_cartesian(ylim = c(0.05,0.15))
Created on 2021-01-26 by the reprex package (v0.3.0)
aaaaaah I entirely misread this thread then
lol
My priors made me think it was a rehash of #20, my deepest apologies 😅
Classic example of bad priors :P
I cleaned-up the code a bit, and put the warning in the right place. We can finish resolving before the next CRAN release.
re-submitting, then
resubmitted ^^
Dear maintainer,
package bayestestR_0.8.2.tar.gz does not pass the incoming checks automatically, please see the following pre-tests: Windows: https://win-builder.r-project.org/incoming_pretest/bayestestR_0.8.2_20210126_062500/Windows/00check.log Status: OK Debian: https://win-builder.r-project.org/incoming_pretest/bayestestR_0.8.2_20210126_062500/Debian/00check.log Status: 1 NOTE
Because of the note?? The URL works fine...
Do you have an attached text file that also contains another check log?
Lol dont worry I just pasted that here for reference but I'll look into it as soon as I reach soon :)
I'll just reply say it's prolly a false positive :)
Or remove link, merge https://github.com/easystats/bayestestR/pull/367, and re-submit 8-)
good idea, though I already replied, I'll resubmit nonetheless
re-re-submitted 🤞
On CRAN