
mafCompare may yield biased results when comparing cohorts with unequal sample sizes due to absolute mutation count-based Fisher test

Open · fengqlin opened this issue 7 months ago · 1 comment

The current maftools::mafCompare() function performs Fisher's exact test using raw mutation counts (e.g., MutatedSamples) without accounting for cohort sample size disparities. This could lead to misleading statistical significance when:

- Cohort sizes differ substantially (e.g., M1=100 vs M2=1000).
- Mutation frequencies are similar but absolute counts diverge (e.g., 10/100 vs 100/1000).

In such cases, the test may incorrectly report a significant difference because of the large gap in absolute counts.

While mathematically valid, this approach emphasizes absolute counts over rates, which is counterintuitive for frequency comparisons.

Replacing raw mutation counts with mutation proportions (e.g., (m1Mut/m1.sampleSize)*100) for Fisher's exact test would significantly enhance the validity of cross-cohort comparisons.

fengqlin, May 17 '25 08:05

Hi,

It does account for the sample size. It's a 2x2 contingency matrix of mutated and non-mutated values (cohort size - mutated cases) from two cohorts.
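To illustrate the point about the 2x2 table, here is a minimal sketch in Python using only the standard library. It is not the maftools code (which is R); the function name `fisher_exact_two_sided` and its parameter names are hypothetical, and the two-sided p-value is computed by summing the hypergeometric probabilities of all tables that are no more likely than the observed one, which is assumed here to be the convention of interest.

```python
from math import comb


def fisher_exact_two_sided(m1_mut, m1_size, m2_mut, m2_size):
    """Two-sided Fisher's exact test on the 2x2 table

        [[m1_mut, m1_size - m1_mut],
         [m2_mut, m2_size - m2_mut]]

    i.e. mutated vs. non-mutated samples in each cohort.
    (Hypothetical helper for illustration; not part of maftools.)
    """
    total = m1_size + m2_size
    total_mut = m1_mut + m2_mut
    denom = comb(total, total_mut)

    def prob(k):
        # Hypergeometric probability that k of the mutated samples
        # fall in cohort 1, given fixed row and column totals.
        return comb(m1_size, k) * comb(m2_size, total_mut - k) / denom

    lo = max(0, total_mut - m2_size)
    hi = min(m1_size, total_mut)
    p_obs = prob(m1_mut)
    # Sum over tables at least as extreme as the observed one,
    # with a small relative tolerance for floating-point ties.
    p = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Same 10% mutation rate in both cohorts, very different cohort sizes.
p_equal_rate = fisher_exact_two_sided(10, 100, 100, 1000)

# Different rates (30% vs 10%) with the same cohort sizes.
p_diff_rate = fisher_exact_two_sided(30, 100, 100, 1000)

print(p_equal_rate, p_diff_rate)
```

With the 10/100 vs 100/1000 example from the issue, the observed table is the most probable one under the null, so the p-value comes out at 1 (up to floating-point rounding): the non-mutated counts (90 and 900) carry the cohort sizes into the test, and equal rates are not flagged as different.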

PoisonAlien, May 17 '25 09:05