pseudobulking + DE tutorial
See https://decoupler.readthedocs.io/en/latest/notebooks/scell/rna_psbk.html for a starting point, but some questions/TODOs remain:
- decoupler has some nice plotting functionality for `filter_samples`, `filter_by_prop`, etc. Do we want to keep that? Migrate to `scanpy`?
- We need to output some sort of `n_obs` from `scanpy.get.aggregate` to be able to filter properly based on number of replicates. I will open an issue there.
- What should happen to the decoupler notebook and/or pseudobulk function? I like the API, but I would assume at first pass that the `scanpy` implementation is a bit faster/more efficient (not sure though).
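
To make the `n_obs` point concrete, here is a minimal, library-free sketch of what pseudobulking with a per-group observation count could look like. It does not use the scanpy or decoupler APIs; the toy data, the function names `pseudobulk` and `filter_groups`, and the `min_cells` threshold are all hypothetical and only illustrate the idea of summing raw counts per (sample, cell type) group while recording how many cells went into each group. How exactly such a count would feed into replicate-based filtering is left open here.

```python
from collections import defaultdict

# Hypothetical toy data: one entry per cell with its sample, cell type,
# and raw counts for three genes. All names and values are made up.
cells = [
    ("s1", "T", [3, 0, 1]),
    ("s1", "T", [2, 1, 0]),
    ("s1", "B", [0, 4, 1]),
    ("s2", "T", [1, 1, 1]),
    ("s2", "B", [0, 2, 2]),
    ("s2", "B", [1, 3, 0]),
    ("s3", "T", [4, 0, 2]),
]


def pseudobulk(cells):
    """Sum raw counts per (sample, cell_type) group and record the group size."""
    sums = {}
    n_obs = defaultdict(int)
    for sample, cell_type, counts in cells:
        key = (sample, cell_type)
        if key not in sums:
            sums[key] = [0] * len(counts)
        # Element-wise sum of this cell's counts into the group's profile.
        sums[key] = [a + b for a, b in zip(sums[key], counts)]
        n_obs[key] += 1
    return sums, dict(n_obs)


def filter_groups(sums, n_obs, min_cells=2):
    """Keep only pseudobulk profiles built from at least `min_cells` cells."""
    return {key: profile for key, profile in sums.items() if n_obs[key] >= min_cells}


sums, n_obs = pseudobulk(cells)
kept = filter_groups(sums, n_obs, min_cells=2)
```

Without the `n_obs` dictionary, the filtering step has nothing to threshold on, which is the gap described in the second bullet above.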
Is the idea to use decoupler for the computation or just scanpy?
Because if we’re using decoupler, this would be more of a candidate for scverse-tutorials, no?
The idea is to show off scanpy's pseudobulking capability via get.aggregate (which is pretty optimized) and then do DESeq2 on the result. My point is more that the current decoupler notebook is very comprehensive and good, save for the fact that (a) it isn't on the scanpy home page and (b) it doesn't use our optimized pseudobulker.
Ah, so this is about https://scverse.zulipchat.com/#narrow/channel/316218-repo-management/topic/scanpy.20vs.20decoupler.20pseudobulking/with/542229960
And from there I infer you mean PyDESeq2? OK, makes sense to have that here as a showcase for the pseudobulking!