
Return Confidence Interval for nonparametric Mann Whitney U Test

Open kschuerholt opened this issue 2 years ago • 5 comments

The t-test returns, amongst other useful values, the confidence interval on the difference between the means. A CI on the difference of medians would be super useful to have for nonparametric tests like the MWU, so that not everybody has to comb through the literature to figure out how to compute CIs for nonparametric tests.

I'm not entirely sure whether, for generic cases, that would require bootstrapping, or whether closed-form solutions exist and are robust enough. A method to compute CIs for nonparametric tests is given, e.g., in "Calculating confidence intervals for some non-parametric analyses", Michael J Campbell and Martin J Gardner, British Medical Journal, 1988.
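Just to make the approach concrete, here is a rough sketch of the pairwise-difference method from that paper (variable names are hypothetical and this is not the PR's exact code):

```python
# Rough sketch of the Campbell & Gardner (1988) CI on the location shift
# between two independent samples (illustration only, not pingouin's code).
import numpy as np
from scipy.stats import norm

def mwu_ci(x, y, confidence=0.95):
    x, y = np.asarray(x), np.asarray(y)
    n1, n2 = x.size, y.size
    # Normal quantile for a two-sided interval, e.g. ~1.96 for 95%
    z = norm.ppf(1 - (1 - confidence) / 2)
    # Rank K of the ordered pairwise differences (normal approximation)
    k = int(round(n1 * n2 / 2 - z * (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5))
    k = max(k, 1)
    # All pairwise differences x_i - y_j, sorted
    diffs = np.sort((x[:, None] - y[None, :]).ravel())
    return diffs[k - 1], diffs[-k]
```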

kschuerholt avatar Jan 21 '22 12:01 kschuerholt

Hi @kschuerholt,

Thank you for opening the issue and submitting a PR. I'll dive into the latter in the next few days.

This is related to https://github.com/raphaelvallat/pingouin/issues/153.

Thanks, Raphael

raphaelvallat avatar Jan 22 '22 02:01 raphaelvallat

Hi @raphaelvallat

Thanks for your great work, glad to be able to contribute in a small way.

#153 seems to be the same feature request for the Wilcoxon test. The paper I used for the PR also gives CIs for the Wilcoxon test; the computation is very similar to the one for the MWU. It does look different from the CI computation in R, but I'm no statistician.

Best, Konstantin

kschuerholt avatar Jan 22 '22 09:01 kschuerholt

Thank you @kschuerholt! Looking at the documentation of the wilcox.test R function, it seems that their method is based on the following reference: Myles Hollander and Douglas A. Wolfe (1973). Nonparametric Statistical Methods. New York: John Wiley & Sons. Pages 27--33 (one-sample), 68--75 (two-sample).

Optionally (if argument conf.int is true), a nonparametric confidence interval and an estimator for the pseudomedian (one-sample case) or for the difference of the location parameters x-y is computed. (The pseudomedian of a distribution (F) is the median of the distribution of ((u+v)/2), where (u) and (v) are independent, each with distribution (F). If (F) is symmetric, then the pseudomedian and median coincide. See Hollander & Wolfe (1973), page 34.) Note that in the two-sample case the estimator for the difference in location parameters does not estimate the difference in medians (a common misconception) but rather the median of the difference between a sample from x and a sample from y.
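In other words, the two-sample estimate that wilcox.test reports is the median of all pairwise differences (the Hodges-Lehmann estimate). A minimal illustration of that estimator (not pingouin or R code):

```python
# Two-sample Hodges-Lehmann estimate: median of all pairwise differences
# x_i - y_j (illustration of the estimator described above).
import numpy as np

def hodges_lehmann(x, y):
    x, y = np.asarray(x), np.asarray(y)
    return np.median(x[:, None] - y[None, :])
```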

That said, the paper that you have used for the MWU test is more recent than the paper they refer to, and I think it would make sense to use the formula they provide to implement CI for the wilcoxon test as well. Is this something you would have time and bandwidth to implement?

A few other comments on the PR:

  1. The CI should be rounded and not displayed in full float precision, e.g. [-0.39, -0.09] instead of [-0.39290395101879694, -0.09400270319896187]. This should normally be done automatically by the _postprocess_dataframe function, which should round the CI95% column to two decimals.

  2. Do you know of any other implementations (R, Matlab, SPSS) of this CI method? If so, it would be great to add the CI to the unit testing of the MWU function, e.g. by comparing our results against another statistical software package.

  3. Could you make sure that the code follows the contributing guidelines? The code should be flake8-compatible. For instance, there must be whitespace around the arithmetic operators here:

k = int(round(ct1*ct2/2 - (N * (ct1*ct2*(ct1+ct2+1)/12)**0.5)))
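e.g., with flake8-style spacing (same logic, only the formatting changes):

```python
k = int(round(
    ct1 * ct2 / 2 - N * (ct1 * ct2 * (ct1 + ct2 + 1) / 12) ** 0.5
))
```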

Thank you so much for your help on this, Raphael

raphaelvallat avatar Jan 25 '22 22:01 raphaelvallat

Hi @raphaelvallat

I can't promise an ETA, but I can implement the corresponding CI method for the Wilcoxon test in the coming days or weeks.

I checked the source again. The paper I cited earlier is basically a user's reference; it has only been cited 3 times. They in turn appear to take the CI computation method from Conover WJ, Practical Nonparametric Statistics, New York: Wiley, 1980. That appears to be a more reputable source with more than 20,000 citations, but I couldn't get hold of a copy yet. I'll see what I can do on that front. I'm not familiar with the related work, so I can't make a call on which is the better method to use.

Regarding the other comments:

  1. At least locally, _postprocess_dataframe does give me the raw floats. Similar behavior for ttest, e.g. with confidence=0.98. I'm not sure where you'd like me to address that.
  2. I had a look, but as far as I can tell, Matlab doesn't compute a CI, SPSS computes the CI on the p-value, and R uses, as you mentioned above, another method... :/ Maybe the literature holds worked examples that can be used for unit testing.
  3. Sure thing, sorry about that, will be considered in a new commit.

Cheers, Konstantin

kschuerholt avatar Jan 26 '22 17:01 kschuerholt

Hi @kschuerholt,

Thank you! That would be great if you could have a look at the wilcoxon CI, but no pressure at all. I am already very thankful for your contribution.

I was thinking that, since there does not seem to be a single gold-standard method, we could also simply report the bootstrapped confidence intervals, using either scipy.stats.bootstrap or pingouin's own pg.compute_bootci function. However, this would drastically increase computation time, so we would need to allow users to disable the CI (e.g. by setting n_boot=0). Do you prefer the analytical or the bootstrap method?
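For illustration, a minimal bootstrap sketch with scipy.stats.bootstrap (scipy >= 1.7) for the difference of medians could look like the following; the data and variable names are made up:

```python
# Bootstrapped CI for the difference of medians between two independent
# samples (illustration only; in pingouin this could equally go through
# pg.compute_bootci).
import numpy as np
from scipy.stats import bootstrap

rng = np.random.default_rng(42)
x = rng.normal(loc=0.0, scale=1.0, size=50)
y = rng.normal(loc=0.5, scale=1.0, size=60)

def median_diff(a, b):
    return np.median(a) - np.median(b)

res = bootstrap((x, y), median_diff, n_resamples=2000,
                confidence_level=0.95, method="percentile",
                random_state=rng)
print(res.confidence_interval)  # low/high bounds of the 95% CI
```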

Also, please don't worry about the decimal rounding for now. I'll do a deep dive to fix this once the PR is ready.

Thanks, Raphael

raphaelvallat avatar Jan 28 '22 02:01 raphaelvallat