dabestr
Median difference vs. difference in medians for the paired plot
With paired data, the quantity of interest is usually the median difference (the median of the within-pair differences), not the difference in medians. The mean change always equals the change in means, but this does not hold for medians in general.
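A tiny hand-made illustration of that claim (the three pairs are made up for the example, not taken from any real data set): the mean is linear, so both orders of operation agree, while the median is not, and here the two median-based quantities even differ in sign.

```r
# Three hypothetical pairs, chosen by hand so the arithmetic is easy to follow
a <- c(1, 2, 10)
b <- c(3, 0, 4)
d <- a - b                        # within-pair differences: -2, 2, 6

# Means: both orders of operation agree (up to floating-point rounding)
mean_of_diffs <- mean(d)
diff_of_means <- mean(a) - mean(b)
isTRUE(all.equal(mean_of_diffs, diff_of_means))   # TRUE

# Medians: they do not agree
median_of_diffs <- median(d)                # 2
diff_of_medians <- median(a) - median(b)    # 2 - 3 = -1
```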
> set.seed(100)
> a <- rnorm(100, mean = 1)
> b <- rnorm(100, mean=10)
> median(a-b)
[1] -8.987909
> median(a) - median(b)
[1] -8.986542
# Close, but not exactly equal (finite sampling)
> set.seed(100)
> a <- runif(100, 0, 10)
> b <- runif(100, 10, 20)
> median(a-b)
[1] -9.998856
> median(a) - median(b)
[1] -9.968535
# Close, but not exactly equal
> set.seed(100)
> a <- rlnorm(100)
> b <- rlnorm(100, meanlog = 1)
> median(a) - median(b)
[1] -1.585291
> median(a-b)
[1] -1.586252
# very close
> set.seed(100)
> a <- rlnorm(100)
> b <- rlnorm(100, meanlog = 1, sdlog = 3)
> median(a-b)
[1] -1.275271
> median(a) - median(b)
[1] -1.245357
# quite close
> set.seed(100)
> a <- rbeta(100, 10, 1)
> b <- rbeta(100, 1, 10)
> median(a)-median(b)
[1] 0.8524503
> median(a-b)
[1] 0.8345763
# quite close
But the two can differ noticeably, even in sign:
> set.seed(100)
> a <- runif(100, 0, 10)
> b <- rnorm(100, 5, 1)
> median(a-b)
[1] -0.04137754
> median(a) - median(b)
[1] 0.1726379
> set.seed(100)
> a <- rnorm(100, 1, 1)
> b <- rlnorm(100, 0, 1)
> median(a-b)
[1] -0.1147998
> median(a) - median(b)
[1] 0.0107118
> set.seed(100); a <- runif(100, -1, 3); b <- rlnorm(100, 0, 2); median(a-b); median(a)-median(b)
[1] -0.2062541
[1] 0.2443354
> set.seed(100); a <- c(runif(50, -4, -1), rnorm(50, 2, 4)); b <- rnorm(100, 0, 4); median(a-b); median(a)-median(b);
[1] 0.2922939
[1] -1.24893
and so on.
Another example: http://onbiostatistics.blogspot.com/2015/12/median-of-differences-versus-difference.html
Actually, I have never seen a difference in medians reported for paired data. What gets reported is rather the Hodges–Lehmann estimator of the pseudo-median (approximated by the median change when the distribution of changes is symmetric), or the relative effect.
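For reference, a minimal base-R sketch of the one-sample Hodges–Lehmann estimator applied to paired differences: it is the median of the Walsh averages (d_i + d_j)/2 over all i <= j. The vector `d` is made up and deliberately skewed, so the pseudo-median and the plain median of the changes come out different. As far as I know, `wilcox.test(d, conf.int = TRUE)$estimate` reports the same kind of quantity, but I have not checked it against this toy input.

```r
# Hypothetical within-pair differences with one large value (asymmetric)
d <- c(-1, 0, 1, 10)

# Walsh averages: (d_i + d_j) / 2 for every pair with i <= j
walsh <- outer(d, d, "+") / 2
walsh <- walsh[lower.tri(walsh, diag = TRUE)]

hl <- median(walsh)       # Hodges-Lehmann pseudo-median: 0.75
med_change <- median(d)   # plain median of the changes:  0.5
```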
I looked at the code and it says it's about difference in medians: median(treatment) - median(control)
Would you consider adding median difference too?
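To make the request concrete, here is a rough sketch of what I mean. `paired_median_diff` is a hypothetical helper, not an existing dabestr function, and the interval is a plain percentile bootstrap over pairs, which is not necessarily the method the package uses.

```r
# Hypothetical helper: median of the within-pair differences with a
# percentile bootstrap interval. Not part of dabestr.
paired_median_diff <- function(control, treatment, n_boot = 2000, conf = 0.95) {
  stopifnot(length(control) == length(treatment), length(control) > 0)
  d <- treatment - control
  n <- length(d)
  # Resample whole pairs, i.e. resample the differences themselves
  boot <- replicate(n_boot, median(d[sample.int(n, n, replace = TRUE)]))
  alpha <- 1 - conf
  list(
    estimate = median(d),
    ci = unname(quantile(boot, c(alpha / 2, 1 - alpha / 2)))
  )
}

# Usage with one of the simulated examples from this thread
set.seed(100)
a <- runif(100, 0, 10)
b <- rnorm(100, 5, 1)
res <- paired_median_diff(control = b, treatment = a)
res$estimate   # median(a - b), as opposed to median(a) - median(b)
res$ci
```

Resampling the differences rather than `a` and `b` separately keeps the pairing intact, which is the whole point of the paired design.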
Many thanks for the note, we'll aim to correct this.
I hope the summer interns can get to it by August. But if you can see a way to do it yourself, please let us know—and send a pull request when ready.