velocyto.py
Batch effects in combining multiple loom files
My samples are spread across batches, and I'm running velocyto separately for each batch (each has its own cellranger folder), then combining all the loom files into one. When I inspected the gene-wise counts (mean count per gene) and cell-wise counts (mean count per cell) and compared them to cellranger's counts, I got the following results:
Mean counts per gene are higher in the velocyto output than in cellranger's counts
Mean counts per cell may be higher or lower depending on the batch
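For reference, the comparison above can be sketched roughly as follows. This is a minimal illustration with toy data, not the exact script I ran. It assumes the velocyto and cellranger matrices for one batch have already been loaded as genes x cells arrays and aligned to the same genes and cell barcodes. The function names `mean_counts` and `compare_batch` are made up for this example. The velocyto total is taken as the sum of the `spliced`, `unspliced` and `ambiguous` layers, so the spliced layer can also be compared on its own to see which layers account for the difference.

```python
import numpy as np


def mean_counts(matrix):
    """Return (mean count per gene, mean count per cell) for a genes x cells matrix."""
    matrix = np.asarray(matrix, dtype=float)
    return matrix.mean(axis=1), matrix.mean(axis=0)


def compare_batch(spliced, unspliced, ambiguous, cellranger):
    """Compare velocyto layers against cellranger counts for one batch.

    Hypothetical helper: all inputs are assumed to be genes x cells arrays
    that share the same gene order and the same cell barcode order.
    """
    spliced = np.asarray(spliced, dtype=float)
    # Total velocyto signal: sum of the three layers stored in the loom file.
    total = spliced + np.asarray(unspliced, dtype=float) + np.asarray(ambiguous, dtype=float)

    total_gene, total_cell = mean_counts(total)
    spliced_gene, spliced_cell = mean_counts(spliced)
    cr_gene, cr_cell = mean_counts(cellranger)

    # Positive values mean velocyto reports more counts than cellranger.
    return {
        "gene_diff_total": total_gene - cr_gene,
        "gene_diff_spliced": spliced_gene - cr_gene,
        "cell_diff_total": total_cell - cr_cell,
        "cell_diff_spliced": spliced_cell - cr_cell,
    }


# Toy data: 2 genes x 3 cells (made-up numbers, for illustration only).
spliced = np.array([[2, 0, 1], [1, 3, 0]])
unspliced = np.array([[1, 1, 0], [0, 2, 1]])
ambiguous = np.array([[0, 0, 1], [0, 0, 0]])
cellranger = np.array([[2, 1, 1], [1, 3, 1]])

result = compare_batch(spliced, unspliced, ambiguous, cellranger)
for name, diff in result.items():
    print(name, np.round(diff, 3))
```

Running this once per batch and looking at the sign of the per-cell differences is one way to see whether the direction of the discrepancy really changes from batch to batch.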
Clearly, there is a batch effect here. Most approaches I've seen (in publicly shared repositories) simply combine the loom files and merge them with their processed (and batch-corrected) data.
Has anybody encountered this problem? What could be a possible solution to address this before carrying out the velocity analysis?
I ran velocyto without repeat masks (if that makes any difference).
Versions
Chemistry: Single Cell 3' v2
Transcriptome: GRCh38-3.0.0
Cellranger: 3.1.0
velocyto: 0.17.17