velocyto.py
ERROR - Can not locate the barcodes.tsv file! using cellranger multiplexing (cellranger v6.1)
Hi,
I used the 'cellranger multi' command (cellranger v6.1.1) to process my data. I am now trying to use velocyto (v0.17.17) to obtain the counts. Is there any guidance on how to use the velocyto command with cellranger multi output files?
My command is:
velocyto run10x -@ 20 -@ 6000 -m ${REFDIR}/GRCh38_rmsk.gtf "/home/bs/test/cellranger_output_GEM1_08/outs/per_sample_outs/d60-1/count" ${REFDIR}/genes.gtf
I get an error:
2022-02-22 08:21:04,198 - ERROR - This is an older version of cellranger, cannot check if the output are ready, make sure of this yourself
2022-02-22 08:21:04,201 - ERROR - Can not locate the barcodes.tsv file!
cellranger multi produces a different output directory structure. For the data used in the velocyto command above, the directories and files look like this:
`(base) hpc:[cellranger_output_GEM1_08] % ls
cellranger_output_GEM1_08.mri.tgz _filelist _invocation _log outs _perf.truncated _sitecheck _timestamp _vdrkill _cmdline _finalstate _jobmode _mrosource _perf SC_MULTI_CS _tags _uuid _versions
(base) hpc:[cellranger_output_GEM1_08] % ls outs
config.csv multi per_sample_outs
(base) hpc:[cellranger_output_GEM1_08] % ls outs/per_sample_outs/
d60-1 d60-2 d90-1 d90-2
(base) hpc:[cellranger_output_GEM1_08] % ls outs/per_sample_outs/d60-1
1,1 All count metrics_summary.csv web_summary.html
(base) hpc:[cellranger_output_GEM1_08] % ls outs/per_sample_outs/d60-1/count
analysis feature_reference.csv sample_alignments.bam.bai sample_feature_bc_matrix sample_molecule_info.h5 cloupe.cloupe sample_alignments.bam sample_barcodes.csv sample_feature_bc_matrix.h5`
The outs/multi directory:
`(base) hpc:[outs] % ls multi
count multiplexing_analysis
(base) hpc:[outs] % ls multi/count/
feature_reference.csv raw_cloupe.cloupe raw_feature_bc_matrix raw_feature_bc_matrix.h5 raw_molecule_info.h5 unassigned_alignments.bam unassigned_alignments.bam.bai`
Thanks!
Hey, I solved my problem. Rather than using the outs/multi output, I use the per-sample output for each sample, because the outs/multi output does not have the BAM file that velocyto needs. The second point is the file name, which I think is clearer if I just show you the code:
bcmatches = glob.glob(os.path.join(samplefolder, os.path.normcase("outs/filtered_gene_bc_matrices/*/barcodes.tsv")))
if len(bcmatches) == 0:
    bcmatches = glob.glob(os.path.join(samplefolder, os.path.normcase("outs/filtered_feature_bc_matrix/barcodes.tsv.gz")))
if len(bcmatches) == 0:
So by using the output of each sample and changing the file name, I can now run it successfully.
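To illustrate why the lookup fails, here is a minimal sketch of the same glob-based search with two extra candidate paths. It is not velocyto's actual code: the name `find_barcodes` is hypothetical, and the two per-sample patterns are assumptions based on the folder names that appear in this thread, so check them against your own `count` directory.

```python
import glob
import os

# The first two patterns are the ones quoted in the snippet above.
# The last two are assumptions about per-sample `cellranger multi`
# folder names; they are not part of velocyto.
PATTERNS = [
    "outs/filtered_gene_bc_matrices/*/barcodes.tsv",
    "outs/filtered_feature_bc_matrix/barcodes.tsv.gz",
    "sample_filtered_feature_bc_matrix/barcodes.tsv.gz",
    "sample_feature_bc_matrix/barcodes.tsv.gz",
]


def find_barcodes(samplefolder):
    """Return the first barcodes file found under samplefolder, or None."""
    for pattern in PATTERNS:
        full_pattern = os.path.join(samplefolder, os.path.normcase(pattern))
        matches = sorted(glob.glob(full_pattern))
        if matches:
            return matches[0]
    # Returning None avoids the IndexError raised by indexing an empty list.
    return None
```

Calling `find_barcodes` on a per-sample `count` directory returns a path if any of the patterns match and `None` otherwise, which makes it easy to see which layout a given folder actually has.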
Hi, I am getting the same error. I only have a Seurat object, and I exported it with:
writeMM(data1@assays$RNA@counts, file = paste0(dir, "output/matrix.mtx"))
write(x = rownames(dataIntegrated@assays$RNA@counts), file = paste0(dir, "output/features.tsv"))
write(x = colnames(dataIntegrated@assays$RNA@counts), file = paste0(dir, "output/barcodes.tsv"))
With this, I have all the cellranger output, the masked GTF and the reference GTF.
The command I used:
velocyto run10x -m "run_velocity/mm10_rmsk.gtf" "run_velocity/cell_ranger_output/" run_velocity/mm10_2020_A.gtf
ERROR - This is an older version of cellranger, cannot check if the output are ready, make sure of this yourself
ERROR - Can not locate the barcodes.tsv file!
bcfile = bcmatches[0]
IndexError: list index out of range
Thanks!
Please consider using the "velocyto run" command rather than run10x. Here is an example:
velocyto run -b ./sample_filtered_feature_bc_matrix/barcodes.tsv.gz -o ./velocyto ./sample_alignments.bam path/genes.gtf
I think this is due to different folder naming: filtered_feature_bc_matrix vs sample_filtered_feature_bc_matrix. The former is what velocyto expects.
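If you have several samples, the command above could be assembled per sample with something like the sketch below. The helper `build_velocyto_cmd` and its parameters are hypothetical, and the default folder and BAM names are simply the ones used in this thread, so adjust them to whatever your cellranger version actually writes. Only the `-b`, `-o` and `-m` options already shown in this thread are used.

```python
import os


def build_velocyto_cmd(count_dir, gtf, outdir, mask_gtf=None,
                       matrix_dir="sample_filtered_feature_bc_matrix"):
    """Build the argument list for `velocyto run` on one per-sample count dir.

    matrix_dir and the BAM file name are assumptions taken from this thread;
    older outputs may use a different folder name.
    """
    barcodes = os.path.join(count_dir, matrix_dir, "barcodes.tsv.gz")
    bam = os.path.join(count_dir, "sample_alignments.bam")
    cmd = ["velocyto", "run", "-b", barcodes, "-o", outdir]
    if mask_gtf is not None:
        # Optional repeat-mask GTF, passed with -m as in the commands above.
        cmd += ["-m", mask_gtf]
    # Positional arguments: BAM file first, then the annotation GTF.
    cmd += [bam, gtf]
    return cmd
```

The returned list can be printed for inspection or passed to `subprocess.run(cmd, check=True)` once the paths have been verified.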
Hope this helps.