velocyto.py
run10x multiple sample
Hello, when I run the run10x command on 10X samples that were aggregated with the cellranger aggr function, run10x gives me an error.
Traceback (most recent call last):
File "../anaconda3/bin/velocyto", line 11, in
Could you please let me know how to fix it?
If run10X is not supposed to run on cellranger aggregated files, how can I then aggregate the loom files that I generated on three samples separately?
Thank you.
Best Regards, Syed
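On the question of merging loom files that were generated per sample: one possible approach is loompy's `combine` function. The sketch below is only an illustration, not something confirmed in this thread. The helper names `find_loom_files` and `combine_looms` are hypothetical, and `key="Accession"` assumes that every input file carries an `Accession` row attribute.

```python
import glob
import os


def find_loom_files(root):
    """Return the sorted paths of all .loom files found below root."""
    pattern = os.path.join(root, "**", "*.loom")
    return sorted(glob.glob(pattern, recursive=True))


def combine_looms(loom_files, output_file):
    """Merge per-sample loom files into a single loom file.

    Assumption: every input has an 'Accession' row attribute, which is
    used as the key to align genes across files.
    """
    import loompy  # imported lazily; requires loompy to be installed

    loompy.combine(loom_files, output_file, key="Accession")
    return output_file


# Example usage (paths are placeholders):
# files = find_loom_files("path/to/samples")
# combine_looms(files, "merged.loom")
```

Check the merged file afterwards, for example by confirming that the number of cells equals the sum over the input files.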
I am having the same issue. It won't let me run with the advanced run function either with the aggregated file - I just get: Error: Invalid value for "BAMFILE..."
Hello,
Is there any update on the above issue? I'm getting a very similar error to Syed's, but I am not using a .bam file from a 10X Genomics aggregated library.
Here is my code: velocyto run10x -m ~/gm/analysis/velocyto/gWATCLLinneg/mm10_rmsk.gtf ./bamfile ~/refdata-cellranger-mm10-1.2.0/genes/genes.gtf
And here is the error:
/wsu/apps/gnu-4.4.7/python/python-3.6.5/lib/python3.6/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py:47: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
2018-11-16 14:47:20,320 - ERROR - This is an older version of cellranger, cannot check if the output are ready, make sure of this yourself
Traceback (most recent call last):
File "/wsu/apps/gnu-4.4.7/python/python-3.6.5/bin/velocyto", line 11, in
If someone could investigate this I would greatly appreciate it. I am very excited to use your program on our 10X single cell data :]
Cheers,
Rayanne
Hi, I am also using it on an aggregated cellranger output and the problem was that it actually does not find the barcodes.tsv file. In my case in the cellranger aggregate output the folder that run10x.py expects to be called 'filtered_gene_bc_matrices' in
os.path.normcase("outs/filtered_gene_bc_matrices/*/barcodes.tsv")
it is actually called 'filtered_gene_bc_matrices_mex'. The easy fix is to rename the folder and drop the _mex suffix, although that could break other analyses, so you would want to restore the name afterwards. The more elegant fix is to create a symbolic link with the expected name (I did it using 'ln -s').
I will also add that after doing this everything ran smoothly until the creation of the loom file, which is the very last step, and I am having problems there (I actually found this question while trying to find an answer to my problem).
The question is a bit old but I hope it might be of help to someone.
Best, Clara
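The symlink workaround described above can be sketched in Python for anyone who prefers scripting it. This is a minimal sketch under the assumptions stated in that post: the folder names come from the post, the function name `link_matrix_dir` is hypothetical, and other cellranger versions may use a different layout.

```python
import os
from pathlib import Path


def link_matrix_dir(sample_dir,
                    actual="filtered_gene_bc_matrices_mex",
                    expected="filtered_gene_bc_matrices"):
    """Create outs/<expected> as a symlink to outs/<actual> if it is missing.

    The original folder is left untouched, so other analyses that rely on
    the '_mex' name keep working.
    """
    outs = Path(sample_dir) / "outs"
    target = outs / actual
    link = outs / expected
    if not target.is_dir():
        raise FileNotFoundError(f"{target} does not exist")
    if link.exists() or link.is_symlink():
        return link  # nothing to do, the expected name is already present
    # Relative link, so it stays valid if the sample directory is moved.
    os.symlink(actual, link, target_is_directory=True)
    return link


# Example usage (path is a placeholder):
# link_matrix_dir("path/to/aggregated_sample")
```

After the link is in place, the glob pattern quoted above should match the barcodes.tsv file inside the linked folder.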
@Clara22 thank you for your help! Thanks to your comment, and a review of all the other issues for this module, I have managed to generate a loom file for my 10X data. @galib36 @LineWulff I hope you have the program up and running now. If anyone from this thread has any questions, I would be happy to discuss how I got velocyto working for my data.
How did you manage @RBBurl1227 ?
Hello all, I am running into the same problem when trying to run velocyto.
2019-03-26 10:54:40,070 - ERROR - The outputs are not ready
Traceback (most recent call last):
File "/storage/Software/packages/anaconda3/bin/velocyto", line 11, in
Has anybody fixed it or managed to run it on 10x output? Thank you, Hui
Sorry for the late reply. I think this is related to the different folder organization used by later/earlier versions of cellranger. I recommend using the run command and explicitly passing the parameters.
run10x might have to be updated.
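As an illustration of passing the parameters explicitly, here is a small Python sketch that assembles a `velocyto run` call. It assumes the `-b` (barcodes file), `-o` (output folder), `-e` (sample id) and `-m` (repeat mask) options of `velocyto run`; please confirm them with `velocyto run --help` for your installed version. The helper name `build_run_command` and all file paths are hypothetical.

```python
import subprocess


def build_run_command(bamfile, gtffile, barcodes=None, mask=None,
                      outputfolder=None, sampleid=None):
    """Return the argument list for an explicit 'velocyto run' call."""
    cmd = ["velocyto", "run"]
    if barcodes:
        cmd += ["-b", barcodes]       # filtered barcodes file
    if outputfolder:
        cmd += ["-o", outputfolder]   # where the loom file is written
    if sampleid:
        cmd += ["-e", sampleid]       # name used for the output file
    if mask:
        cmd += ["-m", mask]           # repeat masker GTF
    cmd += [bamfile, gtffile]         # positional: BAMFILE then GTFFILE
    return cmd


# Example usage (paths are placeholders):
# cmd = build_run_command("outs/possorted_genome_bam.bam", "genes.gtf",
#                         barcodes="barcodes.tsv", mask="mm10_rmsk.gtf")
# subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting problems with paths that contain spaces.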
The data I am using is 10X data. I have now successfully made a loom file with the run10x command from count data generated by each of these cellranger versions: v1.3.1, v2.0.1, v3.0.0. I was able to create a loom file after I changed the following things:
- loading samtools along with velocyto.py
- changing the name of the .bam file from samplename_possorted_genome_bam.bam to possorted_genome_bam.bam
- using my 10x cellranger count directory as the directory in my code. Previously, I was just directing the program to take the possorted_genome_bam.bam file.
Here is the code that has been successful for me:
module load python velocyto
module load samtools
cd name/data/counts/tissue/
velocyto run10x -m ~/name/analysis/velocyto/experiment/mm10_rmsk.gtf ./sampledirectory ~/refdata-cellranger-mm10-3.0.0/genes/genes.gtf
I make sure I have the velocyto and samtools modules loaded. I set my working directory to the directory where my 10x cellranger count output directory is. Then I run the velocyto run10x command, directing it to my repeat masker file, my genes file, and my 10x cellranger count output directory -> ./sampledirectory.
I hope this is helpful. If this doesn't make sense to someone, feel free to tag me in this thread and I will clarify.
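The checks described in this post can be turned into a quick pre-flight script. The sketch below only tests for the things mentioned in this thread (an `outs` subfolder, a BAM file named possorted_genome_bam.bam, a barcodes file, and samtools on the PATH). The function name `check_sample_dir` is hypothetical, and the exact paths that run10x looks for may differ between velocyto and cellranger versions.

```python
import glob
import os
import shutil

# Barcode locations mentioned in this thread; other layouts may exist.
BARCODE_PATTERNS = (
    os.path.join("outs", "filtered_gene_bc_matrices", "*", "barcodes.tsv"),
    os.path.join("outs", "filtered_feature_bc_matrix", "barcodes.tsv.gz"),
)


def check_sample_dir(sample_dir, require_samtools=True):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    outs = os.path.join(sample_dir, "outs")
    if not os.path.isdir(outs):
        problems.append("no 'outs' subfolder: pass the cellranger count directory")
    if not os.path.isfile(os.path.join(outs, "possorted_genome_bam.bam")):
        problems.append("outs/possorted_genome_bam.bam not found")
    if not any(glob.glob(os.path.join(sample_dir, p)) for p in BARCODE_PATTERNS):
        problems.append("no barcodes file found in the known locations")
    if require_samtools and shutil.which("samtools") is None:
        problems.append("samtools is not on the PATH")
    return problems


# Example usage (path is a placeholder):
# for problem in check_sample_dir("./sampledirectory"):
#     print("WARNING:", problem)
```

Running this before `velocyto run10x` should make it easier to tell a folder-layout problem apart from a problem inside velocyto itself.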
The run command as suggested by @gioelelm worked for me
Hi there, I wonder if I could ask about this, as I am having the same problem. It runs fine on a single sample, with the barcodes.tsv.gz file in ./outs/filtered_feature_bc_matrix/. However, when I switch to the output of CellRanger Aggr, it produces the error SAMPLE FILE not found. That directory also has barcodes.tsv.gz in ./outs/filtered_feature_bc_matrix. Many thanks.
@RBBurl1227 is your ./sampledirectory the directory that contains the "outs" folder, or the outs directory itself, which contains the bam file (i.e. /outs )?
The tutorial says it should be the directory that contains the outs folder:
"velocyto includes a shortcut to run the counting directly on one or more cellranger output folders (e.g. this is the folder containing the subfolder: outs, outs/analys and outs/filtered_gene_bc_matrices)."
I tried various things (#320), but still have not managed to get it to work :(