velocyto.py

Batch effects in combining multiple loom files

Open anshu957 opened this issue 1 year ago • 2 comments

My samples are spread across batches. I run velocyto separately for each batch (each has its own cellranger folder) and then combine all the loom files into one. When I inspect the gene-wise counts (mean count per gene) and cell-wise counts (mean count per cell) and compare them to cellranger's counts, I get the following results:

Mean counts per gene are higher in the velocyto output than in cellranger's counts.

image

Mean counts per cell may be higher or lower depending on the batch.

image
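For reference, the comparison above amounts to something like the following minimal sketch. It uses tiny made-up matrices rather than real data, and it assumes the velocyto total is taken as the sum of the spliced, unspliced and ambiguous layers (the exact layer used for the comparison is not stated here, so treat that as an assumption). The helper names `add_layers`, `mean_per_gene` and `mean_per_cell` are hypothetical and not part of velocyto or cellranger.

```python
# Toy matrices: rows are genes, columns are cells.
# All numbers are invented for illustration only.
spliced = [[3, 0, 1], [2, 2, 0]]
unspliced = [[1, 1, 0], [0, 1, 1]]
ambiguous = [[0, 0, 1], [1, 0, 0]]
cellranger = [[3, 1, 1], [2, 3, 1]]


def add_layers(*layers):
    """Element-wise sum of equally shaped gene x cell matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*layers)]


def mean_per_gene(matrix):
    """Mean count of each gene (row) across all cells."""
    return [sum(row) / len(row) for row in matrix]


def mean_per_cell(matrix):
    """Mean count of each cell (column) across all genes."""
    n_genes = len(matrix)
    return [sum(col) / n_genes for col in zip(*matrix)]


# Assumption: total velocyto count = spliced + unspliced + ambiguous.
velocyto_total = add_layers(spliced, unspliced, ambiguous)

# Compare the two pipelines gene by gene and cell by cell.
for name, matrix in (("velocyto", velocyto_total), ("cellranger", cellranger)):
    print(name, "per gene:", mean_per_gene(matrix))
    print(name, "per cell:", mean_per_cell(matrix))
```

On real data the same two summaries would be computed per batch, so that differences between batches can be separated from differences between the two pipelines.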

Clearly, there is a batch effect here. Most approaches I've seen (in publicly shared repositories) just combine the loom files and merge them with their processed (and batch-corrected) data.

Has anybody encountered this problem? What would be a possible way to address it before carrying out velocity analysis?

I ran velocyto without repeat masks (if that makes any difference).

Versions

Chemistry: Single Cell 3' v2
Transcriptome: GRCh38-3.0.0
Cellranger: 3.1.0
velocyto: 0.17.17

anshu957, Oct 05 '23 16:10