effectsize
Use correction for small-sample bias in all Chisq effect sizes
For $\phi$, the small-sample bias corrected estimate is:
$$ \widetilde{\phi} = \sqrt{\hat{\phi}^2 - \frac{df}{N-1}} $$
This comes from the non-central $\chi^2$ distribution, where $E[\hat{\chi}^2] = df + \phi^2 \times N$, which implies $E[\hat{\phi}^2] = \phi^2 + df / N$.
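As a rough numerical check of the expectation above, the mean of a non-central $\chi^2$ can be integrated directly in base R and compared with $df + \phi^2 N$. The values of `df`, `N` and `phi` are arbitrary, and `phi_adjusted()` is a hypothetical helper written for illustration, not a function from effectsize. Truncating at 0 before taking the square root is an assumption made here to avoid `NaN`.

```r
# Arbitrary illustration values (assumptions, not taken from any data set)
df  <- 2
N   <- 50
phi <- 0.3
ncp <- phi^2 * N  # non-centrality parameter, as in the expectation above

# Mean of the non-central chi-square, computed by numerical integration
mean_chisq <- integrate(
  function(x) x * dchisq(x, df = df, ncp = ncp),
  lower = 0, upper = Inf
)$value

# Should agree (up to integration error) with df + phi^2 * N
c(numeric = mean_chisq, closed_form = df + ncp)

# Dividing by N gives the expected naive phi^2, i.e. phi^2 + df / N
c(expected_phi2 = mean_chisq / N, phi2_plus_bias = phi^2 + df / N)

# Hypothetical helper: small-sample corrected phi from a chi-square statistic.
# Negative adjusted values are truncated at 0 (an assumption of this sketch).
phi_adjusted <- function(chisq, df, N) {
  sqrt(max(0, chisq / N - df / (N - 1)))
}

phi_adjusted(chisq = 6, df = 1, N = 60)
```

The corrected value is always at or below the naive `sqrt(chisq / N)`, and the gap narrows as `N` grows.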
This is used in effectsize for:
- [x] `phi(adjust = TRUE)`
- [x] `cramers_v(adjust = TRUE)`
- [x] `tschuprows_t(adjust = TRUE)`
(The latter two also have a weird scaling factor from Bergsma (2013).)
This correction can be applied to all $\phi$-like effect sizes:
- [ ] `cohens_w()` - makes the most sense, as it applies the same transformation to $\chi^2$ as $\phi$ does.
- [ ] `pearsons_c()` - can be seen as a transformed Cohen's w ( $C = \sqrt{W^2 / (W^2 + 1)}$ ), so using an adjusted w would "adjust" C as well.
- [ ] `fei()` - same reasoning, although the additional scaling factor ( $1/\min(p_E) - 1$ ) might have to be adjusted in a similar manner to the scaling factors of V and T. (See next section.)
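A minimal base-R sketch of what the adjusted w and C could look like for a goodness-of-fit test. `adj_w_c()` is a hypothetical helper, not part of effectsize, and it assumes the relation $C = \sqrt{W^2 / (W^2 + 1)}$ together with truncation of negative adjusted values at 0. The observed counts are made up.

```r
# Hypothetical helper: naive and small-sample adjusted Cohen's w and
# Pearson's C for a goodness-of-fit test (not the effectsize API).
adj_w_c <- function(O, p) {
  # chisq.test() may warn about small expected counts; silenced here
  res   <- suppressWarnings(chisq.test(O, p = p, correct = FALSE))
  chisq <- unname(res$statistic)
  df    <- unname(res$parameter)
  N     <- sum(O)

  w2     <- chisq / N
  w2_adj <- max(0, w2 - df / (N - 1))  # same correction as for phi

  c(
    w     = sqrt(w2),
    w_adj = sqrt(w2_adj),
    C     = sqrt(w2 / (w2 + 1)),
    C_adj = sqrt(w2_adj / (w2_adj + 1))  # C computed from the adjusted w
  )
}

# Made-up counts against expected proportions 0.25 / 0.50 / 0.25
out <- adj_w_c(O = c(40, 40, 20), p = c(0.25, 0.5, 0.25))
out
```

Because C is a monotone function of w, shrinking w shrinks C as well, so no separate correction term is needed in this sketch.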
Some of my thoughts...
Bergsma (2013) suggested changing the scaling factors of V and T in such a way that when (the true) $T=1$, RMSE would be 0 because (regardless of sample size) the estimated T would also be 1.
To achieve this with פ:
$$ \widetilde{פ} = \sqrt{\frac{\widetilde{\phi^2}}{\frac{1}{\min(p_E)} - 1 - \frac{df}{N-1}}} $$
I'm not sure this is the way to go, because it also means that a sample in which פ=1 will produce an estimate of 1, even when the sample size is arbitrarily small. For example:
O <- c(2, 0)
E <- c(0.35, 0.65)
res <- chisq.test(O, p = E, correct = FALSE)
chisq <- unname(res$statistic)
df <- unname(res$parameter)
N <- sum(O)
phi2_adj <- chisq / N - df / (N - 1)
# adjusted Fei
sqrt(phi2_adj /
(1 / min(E) - 1 - df / (N - 1)))
#> [1] 1
# unadjusted Fei
effectsize::fei(O, p = E, ci = NULL)
#> Fei
#> ----
#> 1.00
#>
#> - Adjusted for uniform expected probabilities.
This is also true for T (by design):
mat <- diag(2)
mat[1,1] <- 2
mat
#>      [,1] [,2]
#> [1,]    2    0
#> [2,]    0    1
effectsize::tschuprows_t(mat, ci = NULL)
#> Tschuprow's T (adj.)
#> --------------------
#> 1.00
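To make the arithmetic behind this result visible, here is a base-R sketch of the bias-corrected T as I understand it from Bergsma (2013): both $\phi^2$ and the table dimensions are adjusted. `bergsma_t()` is a hypothetical helper written for this illustration and may differ in detail from what `tschuprows_t()` does internally.

```r
# Hypothetical helper: bias-corrected Tschuprow's T, following the
# formulas in Bergsma (2013) as I understand them.
bergsma_t <- function(tab) {
  # chisq.test() may warn about small expected counts; silenced here
  res <- suppressWarnings(chisq.test(tab, correct = FALSE))
  n   <- sum(tab)
  nr  <- nrow(tab)
  nc  <- ncol(tab)

  phi2     <- unname(res$statistic) / n
  phi2_adj <- max(0, phi2 - (nr - 1) * (nc - 1) / (n - 1))

  # Adjusted numbers of rows and columns (the changed scaling factor)
  r_adj <- nr - (nr - 1)^2 / (n - 1)
  c_adj <- nc - (nc - 1)^2 / (n - 1)

  sqrt(phi2_adj / sqrt((r_adj - 1) * (c_adj - 1)))
}

mat <- diag(2)
mat[1, 1] <- 2
t_adj <- bergsma_t(mat)
t_adj  # approximately 1
```

Here $\hat{\phi}^2 = 1$ and $n = 3$, so the numerator shrinks to $1 - 1/2 = 0.5$, and the adjusted scaling factor shrinks to $0.5$ as well. The two cancel, which is why the estimate stays at 1.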
From what I can see, small-sample bias adjustments almost always shrink the estimate, even when it is perfect (e.g., $R^2_{adj}$, $\omega^2$, $\epsilon^2$). So I think having:
$$ \widetilde{פ} = \sqrt{\frac{\widetilde{\phi^2}}{\frac{1}{\min(p_E)} - 1}} $$
(which uses the regular scaling factor) makes the most sense to me. It would be consistent with the adjusted w in the uniform-binary case, but inconsistent with the adjusted V and T.
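The two candidate scalings can be compared side by side with a small base-R sketch. `fei_adj()` and its `scaling` argument are hypothetical names used only for this illustration. "bergsma" here means the Bergsma-style denominator that also subtracts $df/(N-1)$, and "regular" means the unmodified $1/\min(p_E) - 1$.

```r
# Hypothetical helper comparing the two proposed scalings for adjusted Fei
fei_adj <- function(O, p, scaling = c("regular", "bergsma")) {
  scaling <- match.arg(scaling)
  # chisq.test() may warn about small expected counts; silenced here
  res   <- suppressWarnings(chisq.test(O, p = p, correct = FALSE))
  chisq <- unname(res$statistic)
  df    <- unname(res$parameter)
  N     <- sum(O)

  phi2_adj <- max(0, chisq / N - df / (N - 1))

  denom <- 1 / min(p) - 1
  if (scaling == "bergsma") {
    denom <- denom - df / (N - 1)  # Bergsma-style adjusted scaling factor
  }
  sqrt(phi2_adj / denom)
}

O <- c(2, 0)
E <- c(0.35, 0.65)
fei_regular <- fei_adj(O, E, scaling = "regular")  # shrinks below 1
fei_bergsma <- fei_adj(O, E, scaling = "bergsma")  # stays at 1

c(regular = fei_regular, bergsma = fei_bergsma)
```

With uniform binary probabilities (`p = c(0.5, 0.5)`) the regular scaling factor equals 1, so the "regular" version reduces to the adjusted w, which is the consistency mentioned above.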