ENH: tukeyhsd, pairwise comparison from summary statistics

Open josef-pkt opened this issue 6 years ago • 3 comments

https://stackoverflow.com/questions/50070927/anova-table-and-pairwise-test

Tukey HSD based on group means and var/std (plus group sizes) as sufficient statistics.

I have not looked at it, but for one-way ANOVA this should be possible and easy to add.
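As a rough illustration of why summary statistics are enough in the one-way case: the pooled within-group variance (the ANOVA mean squared error) can be rebuilt from per-group variances and sizes alone. The sketch below uses only the standard library; `pooled_variance` and the data are made up for illustration and are not part of statsmodels.

```python
from statistics import mean, variance


def pooled_variance(variances, ns):
    """Pooled within-group variance from per-group sample variances and sizes."""
    # Each group's variance is weighted by its degrees of freedom (n - 1).
    num = sum(v * (n - 1) for v, n in zip(variances, ns))
    den = sum(n - 1 for n in ns)
    return num / den


# Made-up raw data, used only to check the summary-statistics route.
groups = [[4.1, 5.0, 5.9, 4.6], [6.2, 7.1, 6.8], [5.5, 5.1, 6.0, 5.8, 5.3]]

ns = [len(g) for g in groups]
means = [mean(g) for g in groups]
variances = [variance(g) for g in groups]  # sample variance (ddof=1)

# Mean squared error computed directly from the raw observations.
sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
mse_raw = sse / (sum(ns) - len(groups))

# The same quantity, using only the per-group variances and sizes.
mse_summary = pooled_variance(variances, ns)
```

Both routes give the same value up to floating-point error, which is the quantity a Tukey HSD from summary statistics would need in addition to the group means and sizes.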

Related: #4379 (tukey-hsd improvements), #4323 (discussion of two-way ANOVA)

josef-pkt avatar Apr 29 '18 22:04 josef-pkt

Is anyone working on it?

sadepu1915 avatar Jun 30 '19 23:06 sadepu1915

Just in case anyone is in need of a quick hack, this worked for me. Output is a tuple instead of TukeyHSDResults. For some reason putting the vector of variances directly into tukeyhsd() gave crazy results, only works when var_all is a scalar.

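import numpy as np

def tukey_from_summary(means, stdevs, ns, alpha=0.05):
    from statsmodels.sandbox.stats.multicomp import tukeyhsd

    # Convert inputs so that plain lists work as well as arrays.
    means = np.asarray(means, dtype=float)
    stdevs = np.asarray(stdevs, dtype=float)
    ns = np.asarray(ns)

    variances = stdevs**2
    # Pool the group variances, weighting each by its degrees of freedom (n - 1).
    pooled_var = np.sum(variances * (ns - 1)) / np.sum(ns - 1)
    # Pass the pooled variance as a scalar; a vector of variances gave wrong results.
    res = tukeyhsd(mean_all=means, nobs_all=ns, var_all=pooled_var, alpha=alpha, q_crit=None)

    return res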
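For anyone who wants to cross-check the result without the sandbox function, here is a minimal sketch of the Tukey-Kramer computation from summary statistics. It assumes a pooled (equal) variance across groups and uses `scipy.stats.studentized_range` (SciPy 1.7 or later). The function name `tukey_kramer_from_summary` and the returned dict layout are my own choices, not a statsmodels API.

```python
import numpy as np
from scipy.stats import studentized_range  # available in SciPy >= 1.7


def tukey_kramer_from_summary(means, stdevs, ns, alpha=0.05):
    """Pairwise Tukey-Kramer comparisons from group means, std devs and sizes."""
    means = np.asarray(means, dtype=float)
    stdevs = np.asarray(stdevs, dtype=float)
    ns = np.asarray(ns, dtype=float)

    k = len(means)
    df = ns.sum() - k  # within-group degrees of freedom
    # Pooled variance, weighting each group by its degrees of freedom.
    pooled_var = np.sum(stdevs**2 * (ns - 1)) / df

    i, j = np.triu_indices(k, 1)  # all pairs with i < j
    diff = means[j] - means[i]
    # Standard error on the studentized-range scale (note the division by 2).
    se = np.sqrt(pooled_var / 2.0 * (1.0 / ns[i] + 1.0 / ns[j]))

    q_stat = np.abs(diff) / se
    pvalues = studentized_range.sf(q_stat, k, df)
    q_crit = studentized_range.ppf(1 - alpha, k, df)
    half_width = q_crit * se

    return {
        "pairs": list(zip(i.tolist(), j.tolist())),
        "meandiffs": diff,
        "pvalues": pvalues,
        "confint": np.column_stack([diff - half_width, diff + half_width]),
        "reject": q_stat > q_crit,
    }


# Made-up summary statistics for three balanced groups.
res = tukey_kramer_from_summary([10.0, 12.0, 15.0], [2.0, 2.0, 2.0], [10, 10, 10])
```

With these made-up numbers only the first pair (groups 0 and 1) is not rejected at alpha = 0.05. If the sandbox `tukeyhsd` disagrees with this on the same inputs, that would be worth reporting here.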

g-simmons avatar Jul 24 '19 04:07 g-simmons

> putting the vector of variances directly into tukeyhsd() gave crazy results, only works when var_all is a scalar.

Sounds like a bug. There were some bugs in my initial var correction functions. Maybe they have not all been fixed.

josef-pkt avatar Sep 05 '22 19:09 josef-pkt

Check during refactoring #8396.

Helper functions such as the var computation need more unit tests, and possibly removal of code duplication.
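A sketch of the kind of unit tests meant here, written against a stand-in helper rather than the actual statsmodels functions. `pair_var_factors` is hypothetical: it returns the factor `1/n_i + 1/n_j` for each pair of groups, halved when `srange=True` (the studentized-range scale). The real helpers may differ in name, signature and return type.

```python
from itertools import combinations
from math import isclose


def pair_var_factors(ns, srange=True):
    """Hypothetical stand-in: variance factor 1/n_i + 1/n_j for each pair i < j.

    The factor is halved when srange is True (studentized-range scale).
    """
    factors = {}
    for i, j in combinations(range(len(ns)), 2):
        f = 1.0 / ns[i] + 1.0 / ns[j]
        factors[(i, j)] = f / 2.0 if srange else f
    return factors


def test_balanced_reduces_to_one_over_n():
    # With equal group sizes n, the halved factor is exactly 1 / n.
    factors = pair_var_factors([8, 8, 8], srange=True)
    assert all(isclose(f, 1 / 8) for f in factors.values())


def test_unbalanced_pair_by_hand():
    # Hand-computed value for a single unbalanced pair.
    factors = pair_var_factors([4, 12], srange=False)
    assert isclose(factors[(0, 1)], 1 / 4 + 1 / 12)


def test_srange_halves_the_factor():
    ns = [5, 7, 11]
    full = pair_var_factors(ns, srange=False)
    half = pair_var_factors(ns, srange=True)
    assert all(isclose(half[p], full[p] / 2) for p in full)
```

The functions follow the pytest naming convention, so they can be collected as-is. The same three checks (balanced case, hand-computed unbalanced pair, `srange` scaling) could be pointed at the real helpers during the refactoring.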

josef-pkt avatar Dec 05 '22 14:12 josef-pkt