
Results changed from v0.6.13 to v0.7.1

Open grst opened this issue 5 years ago • 2 comments

Hi,

while experimenting with diffxpy, I noticed that the results changed since I upgraded from v0.6.13 to v0.7.1. Is that intentional?

The setup:

I added diffxpy to the DE benchmark by Van den Berge 2019.

Under v0.6.13, diffxpy wald_test with nb noise-model produces results highly comparable to edgeR or a NB-model from python statsmodels:
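For readers unfamiliar with the test being compared here, a Wald test on a single coefficient divides the estimate by its standard error and compares the result to a standard normal distribution. The sketch below is the generic textbook form, not diffxpy's or statsmodels' actual implementation; the function name `wald_test` and the example numbers are hypothetical.

```python
import math


def wald_test(coef, se):
    """Return (z statistic, two-sided p-value) for H0: coef == 0.

    Generic textbook Wald test; an illustration only, not the code
    path used by diffxpy, edgeR, or statsmodels.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = coef / se
    # Two-sided p-value under a standard normal reference distribution:
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical log-fold-change estimate and standard error for one gene.
z, p = wald_test(coef=0.8, se=0.3)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Because the p-value depends directly on the fitted coefficient and its standard error, a change in the optimizer that alters either quantity would shift the p-values even if the test itself is unchanged.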

Number of DE calls (nDE), true positive rate (TPR), and observed false discovery rate (FDR) on simulated data at a nominal FDR cutoff of 0.05:

| Method         | nDE | TPR (%) | FDR (%) |
|----------------|-----|---------|---------|
| edgeR          | 78  | 7.1     | 9.0     |
| diffxpy_wald   | 84  | 7.7     | 8.3     |
| statsmodels_nb | 85  | 7.7     | 9.4     |

However, under v0.7.1, diffxpy calls far more genes (224 instead of 84), and its observed FDR rises from 8.3% to 13.8%:

| Method         | nDE | TPR (%) | FDR (%) |
|----------------|-----|---------|---------|
| edgeR          | 78  | 7.1     | 9.0     |
| diffxpy_wald   | 224 | 19.3    | 13.8    |
| statsmodels_nb | 85  | 7.7     | 9.4     |
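The three columns in the tables above can be computed from adjusted p-values and the simulation's ground truth roughly as follows. This is a minimal sketch of the usual definitions, assuming each gene has an adjusted p-value and a known true DE status; the function name `de_metrics` and the toy inputs are hypothetical and not taken from the benchmark code.

```python
def de_metrics(padj, is_true_de, alpha=0.05):
    """Summarise DE calls at an adjusted p-value cutoff.

    padj        -- adjusted p-values, one per gene
    is_true_de  -- ground-truth flags from the simulation, one per gene
    Returns nDE (number of calls), TPR (%) and observed FDR (%).
    """
    if len(padj) != len(is_true_de):
        raise ValueError("padj and is_true_de must have the same length")
    called = [p <= alpha for p in padj]
    n_de = sum(called)
    # True positives: called DE and truly DE.
    tp = sum(1 for c, t in zip(called, is_true_de) if c and t)
    fp = n_de - tp
    n_true = sum(1 for t in is_true_de if t)
    tpr = 100.0 * tp / n_true if n_true else float("nan")
    # Observed FDR: share of calls that are false; 0 when nothing is called.
    fdr = 100.0 * fp / n_de if n_de else 0.0
    return {"nDE": n_de, "TPR": tpr, "FDR": fdr}


# Toy input (hypothetical): six genes, three of them truly DE.
padj = [0.01, 0.03, 0.20, 0.04, 0.60, 0.001]
truth = [True, True, True, False, False, False]
print(de_metrics(padj, truth))
```

Under these definitions, a method whose p-values are too small calls more genes overall, which raises both TPR and observed FDR at the same nominal cutoff.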

Availability

The full analysis reports are available here:

The analysis is available at https://github.com/grst/benchmark-single-cell-de-analysis/. Everything is wrapped in a nextflow pipeline that uses conda envs. Simply running nextflow run ./benchmark.nf should reproduce the above reports.

grst, Nov 11 '19 12:11

Thanks @grst for the great problem description; I am looking into this. This is probably linked to our change of the default backend to numpy-based optimizers.

davidsebfischer, Nov 11 '19 15:11

Btw, the example data is also available here as tsv files, which is probably easier for you than running the entire pipeline: https://github.com/grst/benchmark-single-cell-de-analysis/tree/master/diffxpy_test

Also, I have the impression that 0.7.1 runs significantly slower than 0.6.13. Is that something you can confirm?
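One way to check the runtime impression is to time the same call in an environment for each version and compare the numbers. The helper below is a generic sketch using only the standard library; `time_call` is a hypothetical name, and the lambda is a stand-in for whatever diffxpy call is being measured.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in workload (hypothetical); replace with the actual test call
# and run once per installed version.
elapsed = time_call(lambda: sum(range(100_000)))
print(f"best of 3: {elapsed:.4f} s")
```

Taking the best of several repeats reduces noise from other processes, though a fit that involves random initialisation may still vary between runs.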

grst, Nov 11 '19 15:11