Interpreting significance in DMRs
This has been asked before, but not fully answered: how should I interpret the Cohen's h statistic?
There isn't much guidance online beyond the usual rule of thumb: h = 0.2 is a small effect, h = 0.5 a medium effect, and h = 0.8 a large effect.
I can see that the value ranges from negative pi to pi.
Does modkit DMR output the absolute value of Cohen's h, or is it directional? That is, is a Cohen's h of -0.8 comparable to a Cohen's h of 0.8, with the sign indicating decreased vs increased methylation?
If it is directional, how should I interpret and filter on the lower and upper CIs for a negative Cohen's h value?
Hello @eesiribloom,
The Cohen's h metric is directional, but the CIs are reported for its absolute value, so they are identical whichever way the difference goes. You can see this in the test here, reproduced and annotated below:
fn test_cohen_h_signed() {
    let n = 100;
    // the signature for this function is
    // fn calc_cohen_h(p1: f64, p2: f64, n1: usize, n2: usize) -> CohenHResult
    // p1 and p2 are the frequencies for the two conditions and n1 and n2 are the sample sizes.
    // in the first case the difference in frequencies is negative (0.1 - 0.5 = -0.4)
    // and the Cohen's h metric is negative
    let res1 = calc_cohen_h(0.1, 0.5, n, n); // Cohen's h = -0.9272952180016124
    // alternatively, when the difference is positive, Cohen's h is positive
    let res2 = calc_cohen_h(0.5, 0.1, n, n); // Cohen's h = 0.9272952180016124
    // this is tested here (`neg()` comes from the std::ops::Neg trait)
    assert_eq!(res1.h, res2.h.neg());
    // but also see that the CI values are the same
    assert_eq!(res1.h_low, res2.h_low);
    assert_eq!(res1.h_high, res2.h_high);
}
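For reference, the two h values in the comments above can be reproduced from the textbook definition of Cohen's h, h = 2*asin(sqrt(p1)) - 2*asin(sqrt(p2)). The sketch below is only that textbook formula, not Modkit's implementation: `cohen_h` is a hypothetical helper, and the confidence interval is left out because its calculation is not described in this thread.

```rust
/// Textbook Cohen's h for two proportions (hypothetical helper, not Modkit code):
/// h = 2*asin(sqrt(p1)) - 2*asin(sqrt(p2)).
fn cohen_h(p1: f64, p2: f64) -> f64 {
    2.0 * p1.sqrt().asin() - 2.0 * p2.sqrt().asin()
}

fn main() {
    // Same frequencies as in the test above.
    let h_neg = cohen_h(0.1, 0.5); // p1 < p2, so h is negative
    let h_pos = cohen_h(0.5, 0.1); // p1 > p2, so h is positive
    println!("h(0.1, 0.5) = {h_neg}");
    println!("h(0.5, 0.1) = {h_pos}");

    // Swapping the arguments only flips the sign.
    assert_eq!(h_neg, -h_pos);

    // The extremes are reached when one frequency is 1 and the other is 0.
    println!("h(1.0, 0.0) = {}", cohen_h(1.0, 0.0)); // approximately pi
}
```

Because swapping p1 and p2 only changes the sign, the magnitude can be compared against the usual 0.2 / 0.5 / 0.8 guide regardless of direction.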
One thing you've brought to my attention (thank you!) is that the directionality is probably the opposite of what people expect. I think most people would expect the difference to be treatment - control, whereas Modkit currently computes control - treatment. I'm rethinking the DMR metrics at the moment, but in the meantime I've attached a build that computes treatment - control.
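As a rough sketch of how one might filter with this in mind: since the CI bounds describe the magnitude of h, the threshold can be applied to the lower bound and the direction read from the sign of h separately. Everything here is hypothetical (the `DmrEffect` struct, the `classify` and `flip` helpers, and the numbers in `main`, which are made up for illustration); it assumes the three values have already been parsed from the DMR output.

```rust
/// Hypothetical record for one region: signed h plus CI bounds on |h|.
#[derive(Debug, Clone, Copy)]
struct DmrEffect {
    h: f64,      // signed Cohen's h (first sample minus second sample)
    h_low: f64,  // lower CI bound, on the absolute value of h
    h_high: f64, // upper CI bound, on the absolute value of h
}

#[derive(Debug, PartialEq)]
enum Direction {
    HigherInFirst,  // h > 0: the first sample has the higher frequency
    HigherInSecond, // h < 0: the second sample has the higher frequency
}

/// Keep a region only if the lower bound on |h| reaches `min_abs_h`,
/// then take the direction from the sign of h.
fn classify(e: DmrEffect, min_abs_h: f64) -> Option<Direction> {
    if e.h_low < min_abs_h {
        return None;
    }
    if e.h > 0.0 {
        Some(Direction::HigherInFirst)
    } else if e.h < 0.0 {
        Some(Direction::HigherInSecond)
    } else {
        None
    }
}

/// Switch between the two subtraction orders: only the sign of h changes,
/// the bounds on |h| stay as they are.
fn flip(e: DmrEffect) -> DmrEffect {
    DmrEffect { h: -e.h, ..e }
}

fn main() {
    // Illustrative numbers only, not real Modkit output.
    let e = DmrEffect { h: -0.93, h_low: 0.65, h_high: 1.20 };
    println!("{:?} (upper bound {})", classify(e, 0.5), e.h_high);
    println!("{:?}", classify(flip(e), 0.5));
}
```

Which condition counts as the "first" sample depends on the build in use: control - treatment in the current release, treatment - control in the attached build.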