
Using spatial transcriptomics data on Velocyto.py generates lower counts than Space Ranger

Open biyang-bioinfo opened this issue 3 months ago • 0 comments

Hi velocyto experts,

I ran velocyto on 10x Visium V2 CytAssist (FFPE) spatial data and got the loom file. Here is my CLI:

velocyto run10x <my spaceranger output directory> /refdata_10x_st/refdata-gex-GRCh38-2020-A/genes/genes.gtf

The total counts of adata.layers["spliced"] are 1070434 (from 292 genes), and the total counts of adata.layers["unspliced"] are 4727980 (from 1510 genes). The total counts of adata.X are 1070434 (from 292 genes), the same as the spliced layer. These counts are much lower than the sum of the Space Ranger ["spatial"]@ counts, 181355517.

This probably does not come from the filtering of multimapped or unmapped reads, because the multimapped rate is 2.7% and the total mapping rate is 99.0%. Besides, only 0.7% of reads were skipped because no appropriate cell or UMI barcode was found.

I think this is related to Visium V2 being a probe-based assay. Most of the probes target exonic regions and do not overlap splice junctions (the read length is only 50 bp). Consequently, these probes would be less likely to detect pre-mRNA. However, this explanation doesn't entirely account for the much lower spliced counts or for the detection of so many unspliced reads.

This raises the question of whether Velocyto is suitable for analyzing data generated by probe capture enrichment sequencing. I'm also curious about the underlying principle used to differentiate between spliced and unspliced reads in this context.

Looking forward to your reply!

Sincerely, Biyang

biyang-bioinfo, May 08 '24 03:05
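For anyone trying to reproduce the per-layer numbers quoted in the post, here is a minimal sketch of how the "total counts" and "number of genes" figures can be computed. The helper name `layer_summary` and the toy matrices are hypothetical and only for illustration. The sketch assumes a spots x genes count matrix and counts a gene as detected when its summed count is nonzero, which may not be exactly how the original numbers were derived.

```python
import numpy as np


def layer_summary(matrix):
    """Return (total_counts, n_genes_detected) for a spots x genes count matrix.

    Works for dense NumPy arrays and for SciPy sparse matrices, since both
    expose .sum(axis=0); np.asarray(...).ravel() flattens either result.
    """
    per_gene = np.asarray(matrix.sum(axis=0)).ravel()
    total_counts = int(per_gene.sum())
    n_genes_detected = int(np.count_nonzero(per_gene))
    return total_counts, n_genes_detected


# Toy data (hypothetical): 2 spots x 4 genes.
spliced = np.array([[3, 0, 0, 1],
                    [2, 0, 0, 0]])
unspliced = np.array([[0, 5, 0, 2],
                      [1, 4, 0, 0]])

for name, layer in (("spliced", spliced), ("unspliced", unspliced)):
    total, n_genes = layer_summary(layer)
    print(f"{name}: total counts = {total}, genes detected = {n_genes}")
```

With a real velocyto loom file, and assuming scanpy is installed, the same helper could be applied after `adata = scanpy.read_loom(<path to loom>)` as `layer_summary(adata.layers["spliced"])` and `layer_summary(adata.layers["unspliced"])`. Running it on the `"ambiguous"` layer that velocyto also writes may show whether a share of the reads ended up there instead.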