velocyto.py

Error in SAMPLEFOLDER

Open skoturan opened this issue 1 year ago • 12 comments

Hi, I have been trying to run velocyto run10x, but I keep getting this error for my sample folder:

Error: Invalid value for 'SAMPLEFOLDER': Directory '~rds/.../filtered_feature_bc_matrix' does not exist.

Could you please tell me which files this command requires in the 10x output folder: barcodes.tsv, matrix.tsv and features.tsv? Thanks

skoturan avatar Aug 01 '22 11:08 skoturan

velocyto run10x requires the Cell Ranger outputs, laid out as in the image below. [image] Use a command like the one below:

velocyto run10x -@ 16 -m /mnt/d/KP/hg38_rmsk.gtf /home/hyjforesight/SMC05T/ /mnt/d/KP/refdata-gex-GRCh38-2020-A-modified/genes/genes.gtf

hyjforesight avatar Aug 03 '22 21:08 hyjforesight
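
For anyone hitting the same "does not exist" error, a quick pre-flight check on the folder can save a failed run. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that run10x is given the top-level Cell Ranger sample folder (the one that contains outs/), and that it looks inside it for outs/possorted_genome_bam.bam plus a barcodes file. `check_run10x_folder` is a made-up helper name, not part of velocyto.

```shell
# Hypothetical helper: report whether DIR looks like a folder run10x can use.
# ASSUMPTION: run10x expects DIR/outs/possorted_genome_bam.bam and a barcodes
# file under DIR/outs/filtered_feature_bc_matrix/ (newer Cell Ranger) or
# DIR/outs/filtered_gene_bc_matrices/<genome>/ (older Cell Ranger).
check_run10x_folder() {
    local dir="$1"
    local status=0
    local found=1
    local f

    if [ ! -f "$dir/outs/possorted_genome_bam.bam" ]; then
        echo "missing: $dir/outs/possorted_genome_bam.bam"
        status=1
    fi

    # An unmatched glob stays literal and simply fails the -f test.
    for f in "$dir"/outs/filtered_feature_bc_matrix/barcodes.tsv.gz \
             "$dir"/outs/filtered_gene_bc_matrices/*/barcodes.tsv; do
        if [ -f "$f" ]; then
            found=0
        fi
    done
    if [ "$found" -ne 0 ]; then
        echo "missing: barcodes file under $dir/outs"
        status=1
    fi

    return "$status"
}
```

Calling `check_run10x_folder /home/hyjforesight/SMC05T` before the command above would print what is missing, if anything. Under the same assumption, pointing the command at the filtered_feature_bc_matrix folder itself, as in the original error message, would fail this check.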

Hi hyjforesight, I have been getting this error as well, but only when I run on the output of cellranger multi.

I assume this is because the outs folder structure is different: sample_alignment.bam and its bam.bai are in a subfolder, 'sample1/outs/per_sample_outs/sample1/', instead of directly in outs.

Is that correct? Could you please advise? I even tried giving the path as 'sample1/outs/per_sample_outs/sample1/' instead of just sample1, but that didn't work either. Thanks

Sa753 avatar Oct 23 '22 17:10 Sa753

Hello @Sa753, we also did multiplexed scRNA-seq. As I remember, velocyto run10x didn't work on Cell Ranger multi output. You should use velocyto run (the "on any technique" command) on the individual BAM files. [image]

hyjforesight avatar Oct 23 '22 19:10 hyjforesight

Hi hyjforesight,

Thanks so much for the prompt reply. So I will have to run Cell Ranger on the individual samples and then run velocyto. Is that correct? Thanks

Sa753 avatar Oct 23 '22 20:10 Sa753

Hello @Sa753, if your samples were prepared with the 10x CellPlex kit, you will get 2 libraries after sequencing. One is the oligo library, the other is the sequencing data (sorry, I don't remember the exact names). You should then run Cell Ranger multi to demultiplex the sequencing data. The Cell Ranger multi output includes a folder for each individual sample, and the BAM file is in there. Use these BAM files and run velocyto run (the "on any technique" command) on them one by one. velocyto run10x is only for samples prepared with the 10x v2 or v3 kit.

hyjforesight avatar Oct 23 '22 20:10 hyjforesight
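
To see which per-sample BAM files a Cell Ranger multi run actually produced, something like the following can help. It is only a sketch: it assumes the run folder contains outs/per_sample_outs/, as described earlier in this thread, and it matches any *.bam rather than a fixed file name, because the name is reported differently in different posts here. `list_per_sample_bams` is a hypothetical helper name.

```shell
# Hypothetical helper: list every BAM file below outs/per_sample_outs of a
# Cell Ranger multi run folder, one path per line, sorted.
# ASSUMPTION: per-sample outputs live under <run>/outs/per_sample_outs/.
list_per_sample_bams() {
    local multi_dir="$1"
    find "$multi_dir/outs/per_sample_outs" -type f -name '*.bam' | sort
}
```

Each path it prints is a candidate input for a separate velocyto run call, one sample at a time.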

Hi hyjforesight,

Yes, Cell Ranger multi produces a separate BAM file for each of its components, but velocyto is still not accepting this BAM file. One more thing: are you sure that velocyto run10x is only for v2 or v3 kits? Velocyto was established long before the v2 and v3 kits were made, and I have run it on samples prepared with v1. Thanks

Sa753 avatar Oct 23 '22 21:10 Sa753

Hello @Sa753, try this:

# sort individual bam first
samtools sort /home/hyjforesight/sorting/Apc_Tumor.bam -o /home/hyjforesight/sorting/possorted_Apc_Tumor.bam -@ 16
# use the 10X barcodes.tsv and 10X reference genome file
velocyto run -@ 16 -b /home/hyjforesight/barcodes.tsv -o /home/hyjforesight/loom/ -m /home/hyjforesight/mm10_rmsk.gtf /home/hyjforesight/possorted_Apc_Tumor.bam /home/hyjforesight/refdata-gex-mm10-2020-A/genes/genes.gtf

Velocyto run10x is also suitable for v1.

hyjforesight avatar Oct 24 '22 03:10 hyjforesight
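
With several demultiplexed samples, the velocyto run command above has to be repeated once per BAM. A small dry-run helper makes it easier to check the arguments before starting anything. This is a sketch, not a tested pipeline: `print_velocyto_run` is a hypothetical name, it only prints the command, the flags are copied from the command quoted above, and it assumes the paths contain no spaces.

```shell
# Hypothetical helper: print (do not execute) a velocyto run command.
# Arguments: BAM, barcodes.tsv, output folder, repeat-mask GTF, genes GTF.
# Flags follow the command quoted above: -@ threads, -b barcodes,
# -o output folder, -m mask. ASSUMPTION: no spaces in any path.
print_velocyto_run() {
    local bam="$1"
    local barcodes="$2"
    local outdir="$3"
    local mask="$4"
    local genes="$5"
    printf 'velocyto run -@ 16 -b %s -o %s -m %s %s %s\n' \
        "$barcodes" "$outdir" "$mask" "$bam" "$genes"
}
```

Once the printed line looks right, it can be copied into the terminal, or piped to `sh` at your own risk.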

Hi hyjforesight,

You pointed out the right thing in the error, which is that it can never find barcodes.tsv. I will try this code and update you. However, I just want to point out that in Cell Ranger multi output the BAM file is not called 'possorted.bam'. There is an assigned.bam in the per_sample_outs/counts folder, which has the filtered reads assigned to cells, and an unassigned.bam in the multi folder/counts, which contains the raw reads. So I think I will use the path to the BAM in the filtered counts, not the raw one.

Thanks

Sa753 avatar Oct 24 '22 09:10 Sa753

Hello @Sa753, please use the assigned.bam. As I remember, I renamed this BAM to the possorted_XXX.bam format so that velocyto doesn't sort it again. Also unzip the barcodes file generated by Cell Ranger multi. Velocyto needs it to match the real cell numbers.

hyjforesight avatar Oct 24 '22 18:10 hyjforesight
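
Unzipping the barcodes file does not require touching the original. A minimal sketch, where `unzip_barcodes` is a hypothetical helper name and both paths are supplied by the caller:

```shell
# Hypothetical helper: write an uncompressed copy of a gzipped barcodes file
# to DEST, leaving the original .gz in place.
unzip_barcodes() {
    local src="$1"
    local dest="$2"
    gzip -dc "$src" > "$dest"
}
```

For example, `unzip_barcodes path/to/barcodes.tsv.gz /home/hyjforesight/barcodes.tsv` would produce the file passed to -b in the velocyto run command earlier in the thread (the source path here is a placeholder).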

Hi hyjforesight,

Can I just clarify whether the possorted_genome.bam that is produced by cellranger (not cellranger multi) needs sorting or not? velocyto run10x always sorts it, and the run takes around 10 h and a large amount of RAM, but in the previous reply you said that velocyto shouldn't sort it again. Am I missing something here? Also, why should I unzip the barcodes file from Cell Ranger multi when velocyto runs on the zipped barcodes files from Cell Ranger without needing to unzip them? Thanks

Sa753 avatar Oct 24 '22 21:10 Sa753
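
On the sorting question, my understanding of the velocyto documentation, which I have not verified against the version used in this thread, is that two different sort orders are involved. The possorted BAM is sorted by position, while velocyto also wants a copy sorted by cell barcode. It looks for that copy as cellsorted_<original BAM name> next to the input BAM and creates it with samtools sort -t CB if it is missing, which would explain the long first run. The sketch below only computes that expected name and prints a sort command; `cellsorted_path` and `print_cellsort_cmd` are hypothetical helper names.

```shell
# Hypothetical helpers, based on the ASSUMPTION that velocyto looks for
# cellsorted_<original name> in the same directory as the input BAM.

# Print the path where the barcode-sorted copy would be expected.
cellsorted_path() {
    local bam="$1"
    printf '%s/cellsorted_%s\n' "$(dirname "$bam")" "$(basename "$bam")"
}

# Print (do not execute) a samtools command that would create that copy.
# ASSUMPTION: paths contain no spaces.
print_cellsort_cmd() {
    local bam="$1"
    printf 'samtools sort -t CB -O BAM -o %s %s\n' "$(cellsorted_path "$bam")" "$bam"
}
```

If that assumption holds, running the printed command once up front, with whatever thread and memory options suit the machine, should let later velocyto runs on the same BAM skip the sort.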

Hi hyjforesight,

I tried the above code and it is not working. It can't find the barcodes.tsv file, and when I added -b with the path to the barcodes.tsv file, the error log said that it didn't understand the argument -b.

Again, just to say, Cell Ranger multi doesn't produce possorted_XXX.bam.

Thanks

Sa753 avatar Oct 25 '22 20:10 Sa753

@Sa753 Could you please show me the contents of your per_sample_outs folder? This is an example of our Cell Ranger outputs, but I don't remember the contents of the CKP subfolder. [image]

hyjforesight avatar Oct 25 '22 21:10 hyjforesight