diffxpy
Results changed from v0.6.13 to v0.7.1
Hi,
While experimenting with diffxpy, I noticed that the results changed after I upgraded from v0.6.13 to v0.7.1. Is that intentional?
The setup:
I added diffxpy to the DE benchmark by Van den Berge et al. (2019).
Under v0.6.13, diffxpy's `wald_test` with the `nb` noise model produces results highly comparable to edgeR or an NB model from Python statsmodels:
True positive rate (TPR) and false discovery rate (FDR) on simulated data at an FDR cutoff of 0.05:
| Method | nDE | TPR (%) | FDR (%) |
|---|---|---|---|
| edgeR | 78 | 7.1 | 9.0 |
| diffxpy_wald | 84 | 7.7 | 8.3 |
| statsmodels_nb | 85 | 7.7 | 9.4 |
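For context, applying an FDR cutoff like the 0.05 used here typically means Benjamini-Hochberg adjustment of the per-gene p-values. A minimal numpy sketch (my own illustration, not the benchmark's actual code):

```python
import numpy as np

def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg procedure: boolean mask of rejected hypotheses."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # find the largest k such that p_(k) <= (k / m) * alpha
    thresholds = alpha * np.arange(1, m + 1) / m
    below = ranked <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True  # reject everything up to that rank
    return reject

print(bh_reject([0.001, 0.008, 0.039, 0.041, 0.6]))
```

Genes with `True` in the mask are counted as called DE (the `nDE` column in the tables).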
However, under v0.7.1, the FDR is substantially inflated for diffxpy:
| Method | nDE | TPR (%) | FDR (%) |
|---|---|---|---|
| edgeR | 78 | 7.1 | 9.0 |
| diffxpy_wald | 224 | 19.3 | 13.8 |
| statsmodels_nb | 85 | 7.7 | 9.4 |
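For reference, the TPR and FDR columns are presumably computed against the simulation ground truth along these lines (a sketch; function and variable names are mine, not the benchmark's):

```python
def tpr_fdr(called_de, truly_de):
    """True positive rate and false discovery rate for DE calls.

    called_de, truly_de: parallel boolean sequences, one entry per gene.
    """
    tp = sum(c and t for c, t in zip(called_de, truly_de))
    fp = sum(c and not t for c, t in zip(called_de, truly_de))
    positives = sum(truly_de)   # number of truly DE genes
    calls = tp + fp             # number of genes called DE (nDE)
    tpr = tp / positives if positives else 0.0
    fdr = fp / calls if calls else 0.0
    return tpr, fdr

calls = [True, True, True, False]
truth = [True, True, False, True]
print(tpr_fdr(calls, truth))  # TPR = 2/3, FDR = 1/3
```

Under this definition, calling many more genes DE (224 vs. 84) without a matching rise in true positives is exactly what drives the FDR up.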
Availability
The full analysis reports and code are available at https://github.com/grst/benchmark-single-cell-de-analysis/.
Everything is wrapped in a Nextflow pipeline that uses conda environments. Simply running `nextflow run ./benchmark.nf` should reproduce the above reports.
Thanks @grst for the great problem description, I am looking into this. This is probably linked to us changing the default backend to numpy-based optimizers.
Btw, the example data is also available as TSV files, which is probably easier for you than running the entire pipeline: https://github.com/grst/benchmark-single-cell-de-analysis/tree/master/diffxpy_test
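Such TSV count matrices can be loaded directly with pandas; the snippet below uses an inline stand-in for the data since the actual file names live in the linked directory:

```python
import io
import pandas as pd

# Tiny stand-in for one of the TSV files (genes x samples).
tsv = "gene\tsample1\tsample2\ngeneA\t5\t0\ngeneB\t12\t3\n"

counts = pd.read_csv(io.StringIO(tsv), sep="\t", index_col=0)
print(counts.shape)  # (2, 2)
```

For the real files, replace the `StringIO` object with the path to the downloaded TSV.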
Also, I have the impression that v0.7.1 runs significantly slower than v0.6.13. Is that something you can confirm?