DE analysis from merging many batches of data

Open arkyl opened this issue 6 months ago • 0 comments

Hi, thanks a lot for your software!

We are planning to perform differential expression (DE) analysis between cell types across many batches of count-matrix data totaling >1M cells. Because these are public datasets, a cell type assignment is already given for each cell. Previously, we did a similar analysis on a single dataset, simply using FindMarkers() after QC/normalization. Now that we have many batches, the main question is whether it is feasible to run FindMarkers() or FindAllMarkers() on the merged data, given the limited RAM on our computer.

  1. Does FindMarkers() work on a merged "BPCells" object? (Create a Seurat object per batch, write and load it with BPCells, QC/normalize each object, merge all objects into one object by CreateSeuratObject, run FindMarkers() on this single object, then write the merged object with BPCells for later use.)

  2. We are also interested in a way to meta-analyze the results of FindMarkers(), e.g. to obtain meta-analytic values of avg_log2FC and p_val. This would greatly reduce run time and RAM usage, as we could run FindMarkers() per batch in parallel and then meta-analyze the per-batch summary statistics (avg_log2FC and p_val). We are thinking of a simple inverse-variance meta-analysis method, but any suggestions would be appreciated. We noticed the MetaDE R package, but it does not seem able to work on summary statistics such as avg_log2FC.
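
To make point 2 concrete, here is a minimal sketch of the kind of fixed-effect inverse-variance combination we have in mind, written in plain Python only to show the arithmetic (it ports directly to R). It rests on assumptions that may not hold: FindMarkers() does not report a standard error, so the hypothetical helper `se_from_pvalue` back-derives one from the two-sided p-value as if it came from a Wald-type z-test, which is only a rough approximation for the default Wilcoxon test. The function names and the example numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def se_from_pvalue(log2fc, pval):
    """Approximate the SE implied by a two-sided p-value.

    ASSUMPTION: the p-value behaves like one from a z-test on log2fc.
    """
    if log2fc == 0:
        raise ValueError("SE cannot be recovered when log2fc is exactly 0")
    # Clamp p so that z is finite and strictly positive.
    p = min(max(pval, 1e-300), 0.999999)
    z = -_STD_NORMAL.inv_cdf(p / 2)
    return abs(log2fc) / z


def inverse_variance_meta(log2fcs, ses):
    """Fixed-effect inverse-variance pooling of per-batch effects for one gene.

    Returns (pooled log2FC, pooled SE, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log2fcs)) / total
    pooled_se = sqrt(1.0 / total)
    z = pooled / pooled_se
    pval = 2 * _STD_NORMAL.cdf(-abs(z))
    return pooled, pooled_se, pval


if __name__ == "__main__":
    # Made-up per-batch results for a single gene.
    batch_log2fc = [1.2, 0.9, 1.5]
    batch_pval = [1e-4, 0.03, 1e-6]
    batch_se = [se_from_pvalue(b, p) for b, p in zip(batch_log2fc, batch_pval)]
    print(inverse_variance_meta(batch_log2fc, batch_se))
```

A fixed-effect model like this assumes every batch estimates the same underlying log2FC; if batches differ a lot, a random-effects model would probably be more appropriate, and we would welcome advice on that as well.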

Thanks a lot for your help with this!

Best, Yue

arkyl, Aug 02 '24 00:08