
Question: Post-hoc Dunn test returns non-significant results

Open darrencl opened this issue 3 years ago • 3 comments

Hi! Thanks a lot for creating this analysis tool.

I would like to check whether it is normal for a post-hoc analysis using Dunn's test, after Kruskal-Wallis, to return no significant results at all among the pairwise comparisons.

Another question: does Dunn's test require a multiple-comparison correction? Either way (with or without correction), I don't get any significant results, even though the Kruskal-Wallis test rejects the null hypothesis.

darrencl avatar Mar 26 '21 05:03 darrencl

Hey! I think this may be due to the insufficient statistical power of Dunn's test. You can try Conover's test instead. I will look for relevant research on this. And yes, Dunn's and Conover's tests require p-value correction.

maximtrp avatar Mar 27 '21 09:03 maximtrp

@maximtrp Thanks for your reply! I just tried Conover's test, and the result is the same: still nothing significant. In fact, the corrected pairwise p-values are actually higher with Conover's test.

I applied the Bonferroni-Holm correction (p_adjust='holm') in both cases.

darrencl avatar Mar 30 '21 00:03 darrencl

Hi,

I ran into the same situation recently. Below is the example from [1], comparing four algorithms over 14 datasets.

# %%
import numpy as np

# Accuracy of four algorithms (columns) on 14 datasets (rows), from [1].
data = np.array([
    [0.763, 0.768, 0.771, 0.798],
    [0.599, 0.591, 0.590, 0.569],
    [0.954, 0.971, 0.968, 0.967],
    [0.628, 0.661, 0.654, 0.657],
    [0.882, 0.888, 0.886, 0.898],
    [0.936, 0.931, 0.916, 0.931],
    [0.661, 0.668, 0.609, 0.685],
    [0.583, 0.583, 0.563, 0.625],
    [0.775, 0.838, 0.866, 0.875],
    [1.000, 1.000, 1.000, 1.000],
    [0.940, 0.962, 0.965, 0.962],
    [0.619, 0.666, 0.614, 0.669],
    [0.972, 0.981, 0.975, 0.975],
    [0.957, 0.978, 0.946, 0.970],
])


# %%
import scikit_posthocs as sp

# Transpose so that each row passed to posthoc_dunn is one algorithm.
sp.posthoc_dunn(data.T, p_adjust='bonferroni')

But it returns meaningless results (every corrected p-value is 1):

     1    2    3    4
1  1.0  1.0  1.0  1.0
2  1.0  1.0  1.0  1.0
3  1.0  1.0  1.0  1.0
4  1.0  1.0  1.0  1.0

I noticed that the implementation of posthoc_dunn ranks the entire data matrix, while [1] ranks row-wise (within each dataset). Does this make any difference?

Thanks a lot !

[1] Demšar, J. Statistical comparisons of classifiers over multiple data sets. JMLR, 2006.

MTandHJ avatar May 17 '22 15:05 MTandHJ

I have checked the algorithm and found no errors. Dunn's original paper suggests ranking all of the data together.

maximtrp avatar May 08 '23 08:05 maximtrp