diffxpy
Fixing two_sample default noise_model
The default noise_model in the two_sample helper function now matches the docstring and the other testing methods.
Thanks for the PR @dburkhardt! I set this to None because the t-test complains if it gets a noise model that is not Gaussian. This way, the t-test can always be run easily, and people have to choose a noise model explicitly (I considered this more advanced) if they use a Wald test. Happy for feedback about this design choice! I could also change the docstring to make this clear.
Hmm okay, I think I'm seeing the issue here. So my guess is that for cases where you're comparing two samples of scRNA-seq, "nb" is the correct noise model for the Wald test.
If you're comparing two clusters, I think none of these tests will yield useful p-values because clustering introduces differences between partitions by design (https://linkinghub.elsevier.com/retrieve/pii/S2405471219302698).
What do you think here? If "nb" is the correct noise model for two_sample comparisons (i.e. comparisons of independently generated sets of cells), then why not have that set by default?
My take on the different tests is that they represent different assumptions on the data distribution and the necessity / way of inclusion of confounding variables. I agree with this.
What do you think here? If "nb" is the correct noise model for two_sample comparisons (i.e. comparisons of independently generated sets of cells), then why not have that set by default?
But I would translate it to: if one chooses a Wald test, then "nb" is set as the default noise model. I can do that internally in the two_sample function. Then the default choice for noise_model is None, which is fine for t-tests and which is changed to "nb" if a Wald test is chosen. Does that match your intuition?
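For concreteness, here is a rough sketch of the defaulting logic proposed above. It is not the actual two_sample implementation: the helper name resolve_noise_model is hypothetical, and the test names "wald" and "t-test" and the Gaussian identifier "norm" are assumptions made for illustration.

```python
from typing import Optional


def resolve_noise_model(test: str, noise_model: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: choose the effective noise model for a two-sample test.

    The default stays None so a t-test runs without extra arguments; it is only
    promoted to "nb" when a Wald test is requested without an explicit choice.
    """
    test = test.lower()
    if test == "wald":
        # A Wald test needs a noise model: fall back to negative binomial
        # unless the caller picked one explicitly.
        return "nb" if noise_model is None else noise_model
    if test == "t-test":
        # Assumption: the t-test only accepts no noise model or a Gaussian one
        # (spelled "norm" here for illustration).
        if noise_model not in (None, "norm"):
            raise ValueError(
                f"t-test does not support noise_model={noise_model!r}; "
                "use None or 'norm'."
            )
        return noise_model
    # Other tests: pass the caller's choice through unchanged.
    return noise_model
```

With this, resolve_noise_model("wald") yields "nb" and resolve_noise_model("t-test") yields None, so neither call requires the user to think about noise models, while an explicit noise_model is still respected for the Wald test.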