bayestestR rope range for linear models

rope range for linear models

Open mattansb opened this issue 3 years ago • 56 comments

For logistic regression, the rope range is a function of 3/sqrt(pi), as this is the sd of the logistic distribution (on the latent scale).

However, this is the conditional sd, which is akin to sigma in a linear model. But in a linear model, the rope range if a function of the marginal sd, no?

This seems odd to me. Have I missed something?

Jan 22 '21 11:01 mattansb

mmh I think the rationale we followed back then was quite simple: finding the equivalent of 0.1 * SD, and the most straightforward way was using the d -> log odds conversion formula... but indeed, there might be more appropriate ways of tackling this, what would you recommend?

Jan 24 '21 07:01 DominiqueMakowski

I thought this was based on Kruschke's book, no?

Jan 24 '21 12:01 mattansb

afaik he only gave recommendation for linear models and from there we derived the logistic one

Jan 24 '21 13:01 DominiqueMakowski

afaik he only gave recommendation for linear models and from there we derived the logistic one

No, see

(from Supplement https://doi.org/10.1177/2515245918771304)

Jan 24 '21 13:01 strengejacke

Reminder for ordinal regression: https://twitter.com/ChelseaParlett/status/1352367383243939841?s=19

Jan 24 '21 13:01 strengejacke

https://github.com/easystats/bayestestR/issues/20

Jan 24 '21 13:01 DominiqueMakowski

It is sigma!

So... "fix" rope_range() with a startup warning?...

Jan 24 '21 14:01 mattansb

It seems that he also gives the 0.1*sd(y) as another option?

How very weird - these two options give very different results!

Jan 24 '21 15:01 mattansb

From my empirical experience, I felt like the current default rope range for logistic was "a bit larger" than for lms (i.e., it was a bit harder to be outside the rope given similarish effects). Is the new value usually bigger or smaller than the old?

Jan 24 '21 23:01 DominiqueMakowski

1. The conditional sd will be smaller than the unconditional sd.

So for the logistic case, the default we use is expected to be rather small - a change of of 0.1*3/sqrt(pi) = 0.18 ~ OR: 1.2 is indeed rather small.

2. We have an inconsistency in the default rope range between models:

For linear models we use the unconditional, and for logistic we use conditional:

	Unconditional	Conditional
Logistic (any `link = "logit"`)	--none--	`3/sqrt(pi)`
Gaussian (really any `link = "identity"`)	S_y = `sd(insight::get_response(model))`	`sigma(model)`

Jan 25 '21 07:01 mattansb

get_results <- function(n=100, d=0.5){
  df <- effectsize::standardize(bayestestR::simulate_difference(n=n, d=d))
  
  m1 <- glm(V0 ~ V1, data=df, family="binomial")
  m2 <- lm(V1 ~ V0, data=df)
                    
  data.frame(n=c(n, n, n),
             d=c(d, d, d), 
             type=c("LM", "GLM - basic", "GLM - sigma"),
             rope=c(bayestestR::rope_range(m2)[2], 0.1 * pi / sqrt(3), 0.1 * insight::get_sigma(m1)),
             coef=insight::get_parameters(m2)[2, 2], insight::get_parameters(m1)[2, 2], insight::get_parameters(m1)[2, 2])
}

data <- data.frame()
for(d in seq(0, 1, length.out = 100)){
  data <- rbind(data, get_results(d=d))
}

library(ggplot2)

ggplot(data, aes(x=d, y=rope, color=type)) +
  geom_point()

^{Created on 2021-01-26 by the reprex package (v0.3.0)}

Indeed, using sigma would result in mostly smaller ROPEs it seems. But I still have troubles with the fact that it depends on the number of 1s and 0s in the dependent variable when the logodds obtained from logistic models are supposed to be unrelated to such properties of the outcome (in their interpretation). I might be missing something out, though

If we have enough evidence to replace the current fixed value by 0.1 * sigma then okay.

Let's first finally submit bayestestR as is, and then for next version we address this + finally the CI range 😬

Jan 26 '21 02:01 DominiqueMakowski

we can add a warning for the rope_range of logistic models

Jan 26 '21 02:01 DominiqueMakowski

@DominiqueMakowski not 100% sure what you're simulation was doing, but you're (again 😂) using the OR to d conversation wrong - it's not for predicting group from y in a logistic regression, it's for predicting a dichotomized y from group in a logistic regression.

Also, the warnings shouldn't be for logistic models (as I noted, there is no unconditional alternative for 3/sqrt(pi)), but for linear models, where we might change from the current unconditional to the conditional.

Jan 26 '21 04:01 mattansb

This is what I'm trying to say:

get_results <- function(n=100, d=0.5){
  df <- bayestestR::simulate_difference(n=n, d=d)
  
  m1 <- glm(I(V1>median(V1)) ~ V0, data=df, family="binomial")
  m2 <- lm(V1 ~ V0, data=df)
  
  data.frame(n=c(n, n, n),
             d=c(d, d, d), 
             type=c("LM - current (S)", "LM - sigma", "GLM (rescaled)"),
             rope=c(bayestestR::rope_range(m2)[2], 0.1 * insight::get_sigma(m2), 0.1), # rescaled, 0.1 * pi / sqrt(3) is just 0.1
             coef=c(insight::get_parameters(m2)[2, 2], insight::get_parameters(m2)[2, 2], insight::get_parameters(m1)[2, 2]))
}

library(purrr)
data <- map_dfr(seq(0, 2, length.out = 50), get_results, n = 100)


library(ggplot2)

ggplot(data, aes(x=d, y=rope, color=type)) +
  geom_line(size = 2) +
  coord_cartesian(ylim = c(0.05,0.15))

^{Created on 2021-01-26 by the reprex package (v0.3.0)}

Jan 26 '21 04:01 mattansb

aaaaaah I entirely misread this thread then

Jan 26 '21 04:01 DominiqueMakowski

lol

Jan 26 '21 04:01 mattansb

My priors made me think it was a rehash of #20, my deepest apologies 😅

Jan 26 '21 04:01 DominiqueMakowski

Classic example of bad priors :P

Jan 26 '21 04:01 mattansb

I cleaned-up the code a bit, and put the warning in the right place. We can finish resolving before the next CRAN release.

Jan 26 '21 04:01 mattansb

re-submitting, then

Jan 26 '21 05:01 DominiqueMakowski

resubmitted ^^

Jan 26 '21 05:01 DominiqueMakowski

Dear maintainer,

package bayestestR_0.8.2.tar.gz does not pass the incoming checks automatically, please see the following pre-tests: Windows: https://win-builder.r-project.org/incoming_pretest/bayestestR_0.8.2_20210126_062500/Windows/00check.log Status: OK Debian: https://win-builder.r-project.org/incoming_pretest/bayestestR_0.8.2_20210126_062500/Debian/00check.log Status: 1 NOTE

Jan 26 '21 08:01 DominiqueMakowski

Because of the note?? The URL works fine...

Jan 26 '21 08:01 mattansb

Do you have an attached text file that also contains another check log?

Jan 26 '21 09:01 strengejacke

Lol dont worry I just pasted that here for reference but I'll look into it as soon as I reach soon :)

Jan 26 '21 09:01 DominiqueMakowski

I'll just reply say it's prolly a false positive :)

Jan 26 '21 14:01 DominiqueMakowski

Or remove link, merge https://github.com/easystats/bayestestR/pull/367, and re-submit 8-)

Jan 26 '21 14:01 strengejacke

good idea, though I already replied, I'll resubmit nonetheless

Jan 26 '21 14:01 DominiqueMakowski

re-re-submitted 🤞

Jan 26 '21 15:01 DominiqueMakowski

On CRAN

Jan 27 '21 00:01 DominiqueMakowski

bayestestR bayestestR copied to clipboard

rope range for linear models

1. The conditional sd will be smaller than the unconditional sd.

2. We have an inconsistency in the default rope range between models:

bayestestR
bayestestR copied to clipboard