velocyto.py
Issues running velocyto
Hi, I am trying to run velocyto using the run10x command. I'm running it on a cluster with 60 GB of memory requested. Do I need more memory than that? Thank you so much for your help.
initial position.
2019-04-29 20:09:26,261 - DEBUG - 2745314 reads were skipped because no apropiate cell or umi barcode was found
2019-04-29 20:09:26,262 - INFO - Now just waiting that the bam sorting process terminates
Traceback (most recent call last):
File "/home/mrr2006/.conda/envs/velocyto/bin/velocyto", line 11, in sort -l [compression] -m [mb_to_use]M -t [tagname] -O BAM -@ [threads_to_use] -o cellsorted_[bamfile] [bamfile]
")
MemoryError: bam file #0 could not be sorted by cells.
This is probably related to an old version of samtools, please install samtools >= 1.6. In alternative this could be a memory error, try to set the --samtools_memory option to a value compatible with your system. Otherwise sort manually by samtools sort -l [compression] -m [mb_to_use]M -t [tagname] -O BAM -@ [threads_to_use] -o cellsorted_[bamfile] [bamfile]
I had the same problem and solved it this way: I ran samtools first to produce the cellsorted_possorted_genome_bam.bam file, and then ran velocyto. The velocyto documentation says: "If the file cellsorted_[ORIGINALBAMNAME] exists, the sorting procedure will be skipped and the file present will be used."
(base) u0119129@gbw-d-l0099:~$ samtools sort -t CB -O BAM -o/mnt/DATA2/RAW_DATA/Re_Run_\ 2018_Data_with_updated_reference_genome/KUL-1-1000cells/outs/cellsorted_possorted_genome_bam.bam /mnt/DATA2/RAW_DATA/Re_Run_\ 2018_Data_with_updated_reference_genome/KUL-1-1000cells/outs/possorted_genome_bam.bam
(base) u0119129@gbw-d-l0099:~$ velocyto run10x -m /mnt/DATA1/Velocyto/alltracks_mask.gtf /mnt/DATA2/RAW_DATA/Re_Run_\ 2018_Data_with_updated_reference_genome/KUL-1-1000cells /mnt/DATA1/Velocyto/refdata-cellranger-mm10-3.0.0/genes/genes.gtf
Hope it helps
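Since the original question was about memory, here is a minimal sketch of how the manual sort above could be given an explicit memory budget. The helper names `per_thread_mb` and `sort_by_cell_cmd` and the 20% headroom are assumptions for illustration, not part of velocyto or samtools. The only samtools behaviour relied on is that `-m` is a per-thread limit and `-@` sets the number of sorting threads, so total use is roughly their product.

```shell
# Hypothetical helper: split a total memory allocation (in GB) across sort threads.
# samtools sort -m is a per-thread limit, so total use is roughly threads * -m.
per_thread_mb() {
    total_gb=$1
    threads=$2
    # Keep about 20% of the allocation free (an arbitrary assumption).
    echo $(( total_gb * 1024 * 8 / 10 / threads ))
}

# Hypothetical helper: print (not run) the manual cell-barcode sort command.
# The output name follows the cellsorted_[ORIGINALBAMNAME] convention quoted above.
sort_by_cell_cmd() {
    bam=$1
    mem_mb=$2
    threads=$3
    dir=$(dirname "$bam")
    base=$(basename "$bam")
    echo "samtools sort -t CB -O BAM -m ${mem_mb}M -@ ${threads} -o ${dir}/cellsorted_${base} ${bam}"
}

# Example: a 60 GB allocation and 8 threads
# mem=$(per_thread_mb 60 8)
# sort_by_cell_cmd outs/possorted_genome_bam.bam "$mem" 8
```

The function only prints the command, so it can be inspected before it is run. Paths that contain spaces, like the one in the commands above, would need quoting when the printed command is executed.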
I hit the same error. What does "-t CB" mean in samtools sort here? Could you explain it? Thanks.
Does CB mean "Cell Barcodes" here?
@SiyiWanggou As I understand it, CB is a tag in the BAM file that contains the cell barcodes. If you want to sort by cell barcode, set -t CB. Do not put the path to the cell barcodes file here.
@kaizen89 Thanks. I found some information on the 10x Genomics website. It describes CB as "Chromium cellular barcode sequence that is error-corrected and confirmed against a list of known-good barcode sequences". I think you are correct.
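To see what the CB tag looks like in practice, a small sketch like the following could be used. It assumes the standard SAM text layout, where optional tags start at the 12th tab-separated field. `extract_cb` is a hypothetical helper name, not a samtools or velocyto command.

```shell
# Hypothetical helper: print the CB (cell barcode) tag value of each SAM record
# read from stdin. Records without a CB tag print nothing.
extract_cb() {
    awk -F '\t' '{
        # Optional tags begin at field 12 in SAM text format.
        for (i = 12; i <= NF; i++)
            if ($i ~ /^CB:Z:/) print substr($i, 6)
    }'
}

# Usage on a real file (requires samtools):
# samtools view possorted_genome_bam.bam | head -n 1000 | extract_cb | sort | uniq -c | sort -rn | head
```

If the pipeline prints nothing for a sample of reads, those reads carry no CB tag, which is worth checking before sorting with -t CB.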
@kaizen89 @saeedfc I tried velocyto run -b barcodes.tsv -o ./velocyto_mcf10a_results -m repeat_msk.gtf cellsorted_possorted_genome_bam.bam hg38_ens94.chr.gtf
and I got an error saying that the cell barcodes and UMIs are not correctly formatted. Am I running this correctly?
@arjun0502 Hi, I'm also struggling with this. Running velocyto run after samtools, and it says the cell and umis are not correctly formatted. Have you managed to solve this? Thanks!
@saeedfc -- When you manually sorted your BAM first, did you then need to use the original bam.bai (index) file that came with cellranger count? Or did you have to index your cellsorted_possorted...bam file? If so, how did you index it? --- Issue #321
# like this? or any options?
samtools index sample.sorted.bam
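One point relevant to the indexing question: `samtools index` expects a coordinate-sorted BAM, so it applies to a file like possorted_genome_bam.bam rather than to one sorted by the CB tag. Whether velocyto needs any index for the cellsorted file is not answered in this thread. The sketch below only reports the sort order declared in a BAM header. `header_sort_order` is a hypothetical helper name.

```shell
# Hypothetical helper: print the SO (sort order) value from the @HD line
# of a SAM header read on stdin. Prints nothing if no SO field is present.
header_sort_order() {
    awk -F '\t' '$1 == "@HD" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SO:/) print substr($i, 4)
    }'
}

# Usage (requires samtools):
# samtools view -H sample.sorted.bam | header_sort_order
# Index only if the line above prints "coordinate":
# samtools index sample.sorted.bam
```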