
Median difference vs. difference in medians for the paired plot

Open · Generalized opened this issue 3 years ago · 1 comment

With paired data, the quantity of interest is usually the median difference (the median of the within-pair differences), not the difference in medians. The mean change always equals the change in means, but this does not hold for medians in general.
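A tiny case that can be checked by hand makes the distinction concrete. The three pairs below are made up purely for illustration; they are not from any real dataset.

```r
# Three made-up pairs (illustrative values only).
a <- c(1, 2, 10)
b <- c(5, 0, 3)

# Means commute with subtraction: both lines give 5/3.
mean(a - b)
mean(a) - mean(b)

# Medians do not: the within-pair differences are -4, 2, 7.
median(a - b)          # 2
median(a) - median(b)  # 2 - 3 = -1
```

Here the two median-based summaries even disagree in sign.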

> set.seed(100)
> a <- rnorm(100, mean = 1)
> b <- rnorm(100, mean=10)
> median(a-b)
[1] -8.987909
> median(a) - median(b)
[1] -8.986542
# Close, but not exactly equal (finite sampling)

> set.seed(100)
> a <- runif(100, 0, 10)
> b <- runif(100, 10, 20)
> median(a-b)
[1] -9.998856
> median(a) - median(b)
[1] -9.968535
# Close, but not exactly equal

> set.seed(100)
> a <- rlnorm(100)
> b <- rlnorm(100, meanlog = 1)
> median(a) - median(b)
[1] -1.585291
> median(a-b)
[1] -1.586252
# very close

> set.seed(100)
> a <- rlnorm(100)
> b <- rlnorm(100, meanlog = 1, sdlog = 3)
> median(a-b)
[1] -1.275271
> median(a) - median(b)
[1] -1.245357
#quite close

> set.seed(100)
> a <- rbeta(100, 10, 1)
> b <- rbeta(100, 1, 10)
> median(a)-median(b)
[1] 0.8524503
> median(a-b)
[1] 0.8345763
# quite close

But when the two distributions differ in shape, the two quantities can differ even in sign:

> set.seed(100)
> a <- runif(100, 0, 10)
> b <- rnorm(100, 5, 1)
> median(a-b)
[1] -0.04137754
> median(a) - median(b)
[1] 0.1726379

> set.seed(100)
> a <- rnorm(100, 1, 1)
> b <- rlnorm(100, 0, 1)
> median(a-b)
[1] -0.1147998
> median(a) - median(b)
[1] 0.0107118

> set.seed(100); a <- runif(100, -1, 3); b <- rlnorm(100, 0, 2); median(a-b); median(a)-median(b)
[1] -0.2062541
[1] 0.2443354

> set.seed(100); a <- c(runif(50, -4, -1), rnorm(50, 2, 4)); b <- rnorm(100, 0, 4); median(a-b); median(a)-median(b);
[1] 0.2922939
[1] -1.24893

and so on.

Another example: http://onbiostatistics.blogspot.com/2015/12/median-of-differences-versus-difference.html

Actually, I have never seen the difference in medians reported for paired data. What gets reported is rather the Hodges–Lehmann estimator of the pseudo-median (which the median change approximates when the distribution of changes is symmetric), or the relative effect.
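For reference, the Hodges–Lehmann pseudo-median of paired differences is the median of all Walsh averages (d_i + d_j) / 2 with i <= j. A minimal sketch follows; `hl_pseudomedian` is a hypothetical helper name, not something from dabestr.

```r
# Hypothetical helper: Hodges-Lehmann pseudo-median of paired differences,
# computed directly as the median of the Walsh averages (d_i + d_j) / 2, i <= j.
hl_pseudomedian <- function(x, y) {
  stopifnot(length(x) == length(y))
  d <- x - y
  walsh <- outer(d, d, "+") / 2                 # all pairwise averages
  median(walsh[upper.tri(walsh, diag = TRUE)])  # keep each i <= j once
}

set.seed(100)
a <- rnorm(100, mean = 1)
b <- rnorm(100, mean = 10)

hl_pseudomedian(a, b)
median(a - b)

# Base R reports the same estimand; for larger samples it may use a
# normal approximation, so the value can differ slightly from the direct one.
wilcox.test(a, b, paired = TRUE, conf.int = TRUE)$estimate
```

For roughly symmetric differences, as in this normal example, the pseudo-median and the median difference should land close to each other.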

I looked at the code and it says it's about difference in medians: median(treatment) - median(control)

Would you consider adding median difference too?
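To make the request concrete, here is a rough sketch of what I mean. `paired_median_effects` and its arguments are hypothetical and not part of the dabestr API, and the plain percentile bootstrap is used only for brevity; I am not claiming it matches the interval method the package uses.

```r
# Hypothetical helper, not part of dabestr: reports both summaries side by
# side, plus a percentile bootstrap interval for the median difference.
paired_median_effects <- function(control, treatment, n_boot = 2000) {
  stopifnot(length(control) == length(treatment))
  d <- treatment - control  # within-pair differences

  # Resample the differences (i.e. whole pairs) so the pairing is preserved.
  boot <- replicate(n_boot, median(sample(d, replace = TRUE)))

  list(
    median_difference     = median(d),
    difference_in_medians = median(treatment) - median(control),
    ci_median_difference  = quantile(boot, c(0.025, 0.975), names = FALSE)
  )
}

set.seed(100)
a <- runif(100, 0, 10)
b <- rnorm(100, 5, 1)
paired_median_effects(control = b, treatment = a)
```

The key point is that resampling happens on the pairs, not on the two groups independently, so the interval refers to the median of the changes.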

Generalized · May 21 '21 20:05

Many thanks for the note, we'll aim to correct this.

I hope the summer interns can get to it by August. But if you can see a way to do it yourself, please let us know—and send a pull request when ready.

adamcc · May 22 '21 05:05