Tukey HSD posthoc test for group interactions in two-way ANOVA

Open reneshbedre opened this issue 5 years ago • 7 comments

Hi,

I would like to know how to perform a pairwise comparison for group interactions using Tukey HSD. I tried to use pairwise_tukeyhsd from statsmodels.stats.multicomp, but it does not support pairwise comparison for group interactions.

In R it can be done like this (see https://rpubs.com/tmcurley/twowayanova)

TukeyHSD(len.aov, which = "supp:dose")

Let me know if there is a solution to achieve this. If not, it will be a good addition to the statsmodels package.

reneshbedre • Jun 12 '20 00:06

statsmodels only has one-way pairwise comparisons.

I have never looked at references for two-way or multi-way pairwise comparisons.

One possibility is to recode the two-way comparison as a one-way comparison over all crossed cells. Based on a very brief look, this might be what TukeyHSD(len.aov, which = "supp:dose") is doing in your link. I don't know the best way to merge two categoricals into a single categorical for all crossed cells using pandas. I have some code that does it the plain numpy way.
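For the pandas side of this, a minimal sketch, assuming the two factors are ordinary DataFrame columns. The column names supp, dose and len are borrowed from the R example; the values are made up for illustration. Each row gets one combined label per crossed cell:

```python
import pandas as pd

# Hypothetical toy data; column names follow the R example, values are made up.
df = pd.DataFrame({
    "supp": ["OJ", "OJ", "VC", "VC", "OJ", "VC"],
    "dose": [0.5, 1.0, 0.5, 1.0, 0.5, 1.0],
    "len": [15.2, 21.5, 8.2, 16.5, 17.6, 17.3],
})

# Cast both factors to strings and join them with a separator so that
# every crossed cell gets exactly one label, e.g. "OJ:0.5".
df["cell"] = df["supp"].astype(str) + ":" + df["dose"].astype(str)

cells = sorted(df["cell"].unique().tolist())
print(cells)
```

The combined column can then be passed as the single grouping variable to a one-way comparison. The explicit separator keeps labels unambiguous when one factor's levels could run into the other's.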

josef-pkt • Jun 12 '20 04:06

Okay, thank you very much for your reply. Can you please show some example of how to do that?

Suppose I have this model:

model = ols('value ~ C(Genotype) + C(years) + C(Genotype):C(years)', data=df).fit()

df.head()
  Genotype   years  value
0        A  1_year   1.53
1        A  1_year   1.83
2        A  1_year   1.38
3        B  1_year   3.60
4        B  1_year   2.94

How can I run Tukey HSD for the interaction?

For the individual factors, I can do this:

pairwise_tukeyhsd(endog=df['value'], groups=df['Genotype'], alpha=0.05)
pairwise_tukeyhsd(endog=df['value'], groups=df['years'], alpha=0.05)

Thank you.

reneshbedre • Jun 14 '20 15:06

Any news about it?

I would also like to apply pairwise_tukeyhsd to a two-way ANOVA.

seralouk • Apr 20 '22 13:04

I believe the two pairwise_tukeyhsd calls at the end of reneshbedre's comment above are fine and can be used to test how Genotype and years each affect value. I haven't found a better way to do this either.

seralouk • Apr 20 '22 13:04

I want to run Tukey HSD on three factors, each with 12 groups (a three-way ANOVA). Has there been any update that supports this now?

MuhammadIKaleem • Apr 16 '23 07:04

No, I still have not figured out what that is theoretically supposed to do.

The only way I looked at two-way interaction is to define each cell as a group unit/level, so there is only a single (combined) group.
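To give a sense of scale for the three-factor case asked about above, a rough count, assuming "12 groups" means 12 levels per factor and that the design is fully crossed (both are assumptions, not stated in the question):

```python
from math import comb

# Assumptions: 3 factors, 12 levels each, every combination observed.
levels_per_factor = 12
n_factors = 3

# Each crossed cell becomes one group in the combined one-way layout.
n_cells = levels_per_factor ** n_factors

# Tukey HSD compares every pair of groups.
n_pairs = comb(n_cells, 2)

print(n_cells, n_pairs)
```

Under those assumptions, the combined grouping has 1728 cells and roughly 1.49 million pairwise comparisons, so the full table would be very large.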

Model results have a pairwise comparison option (t_test_pairwise), but it is also only for one factor and not for a multi-factor effect.

josef-pkt • Apr 16 '23 13:04

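Hello! If I am not mistaken, MultiComparison makes it easy to run a Tukey test across several factors by combining them into a single group label:

import statsmodels.stats.multicomp as mc

# Concatenate the two factor columns so that each crossed cell is one group
tukey = mc.MultiComparison(df['value'], df['Genotype'] + df['years'])
tukey_test = tukey.tukeyhsd()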

I believe this will provide you with the solution
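Here is an end-to-end sketch of that approach on made-up data. The column names follow the thread; the values, cell means and replicate counts are hypothetical. It treats every Genotype-by-years cell as one group in a one-way Tukey HSD. Whether this matches R's TukeyHSD(..., which = "supp:dose") exactly has not been confirmed in this thread.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Made-up data: 2 genotypes x 3 years, 5 replicates per cell (all values hypothetical).
rng = np.random.default_rng(0)
rows = [
    {"Genotype": g, "years": y, "value": rng.normal(loc=base + shift, scale=0.3)}
    for g, base in [("A", 1.5), ("B", 3.0)]
    for y, shift in [("1_year", 0.0), ("2_year", 0.5), ("3_year", 1.0)]
    for _ in range(5)
]
df = pd.DataFrame(rows)

# One label per crossed cell; the ":" separator keeps labels unambiguous.
df["cell"] = df["Genotype"] + ":" + df["years"]

# One-way Tukey HSD over the combined cell labels.
result = pairwise_tukeyhsd(endog=df["value"], groups=df["cell"], alpha=0.05)
print(result.summary())

# One reject/keep decision per pair of cells.
n_comparisons = len(result.reject)
```

With 6 cells there are 15 pairwise comparisons. The summary table lists each pair of cell labels with its mean difference, confidence interval and reject decision.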

Samyaktg • Mar 02 '24 12:03