souporcell
Bug when only 1 cluster in data?
Hi, first: thanks for the singularity-image, a great way to provide scientific tools!
Now to the issue: we tried the 99-1 ratio mix of two donors (B and C) from 10X / Zheng et al. 2017 ("Massively parallel digital transcriptional profiling of single cells") with souporcell. Unfortunately, we got the following error:
imports done
checking bam for expected tags
checking fasta
restarting pipeline in existing directory 10X2017data/rundir_99-1
running souporcell clustering
Traceback (most recent call last):
  File "/opt/souporcell/souporcell_pipeline.py", line 446, in <module>
    souporcell(args, ref_mtx, alt_mtx)
  File "/opt/souporcell/souporcell_pipeline.py", line 387, in souporcell
    "-t", str(args.threads), "-l", args.max_loci, "--min_alt", args.min_alt, "--min_ref", args.min_ref,'--out',cluster_file],stdout=log,stderr=log)
  File "/opt/conda/envs/py36/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['souporcell.py', '-a', '10X2017data/rundir_99-1/alt.mtx', '-r', '10X2017data/rundir_99-1/ref.mtx', '-b', '10X2017data/99-1/outs/filtered_feature_bc_matrix/barcodes.tsv', '-k', '2', '--restarts', '15', '-t', '8', '-l', '2048', '--min_alt', '4', '--min_ref', '4', '--out', '10X2017data/rundir_99-1/clusters_tmp.tsv']' returned non-zero exit status 1.
The 50-50 and 90-10 mixes from the same paper run smoothly. As a side note, when we tried "-k 1" on our own data (to get the loss value with only one assumed donor), we also got an error (I did not write it down). Could there be an issue when the data or the parameters lead souporcell to find only 1 cluster?
Best Julian
Thanks. I have not thought about the one cluster case. I will look into that.
For the error, could you post what it says in the souporcell.log file in the output directory?
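For anyone hitting the same CalledProcessError: the traceback only shows that souporcell.py exited with a non-zero status, while the actual message ends up in souporcell.log. A minimal sketch for pulling the last lines of that log, assuming the log sits directly in the run directory (the helper name tail_log and the default of 20 lines are just illustrative choices, not part of souporcell):

```python
from collections import deque
from pathlib import Path


def tail_log(out_dir, n=20, log_name="souporcell.log"):
    """Return the last n lines of the log in out_dir, or [] if it is absent.

    Hypothetical helper: only the file name souporcell.log comes from the
    thread; everything else is an assumption for illustration.
    """
    log_path = Path(out_dir) / log_name
    if not log_path.is_file():
        return []
    # deque with maxlen keeps only the final n lines while streaming the file
    with log_path.open(errors="replace") as fh:
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


if __name__ == "__main__":
    # Run directory taken from the traceback above
    for line in tail_log("10X2017data/rundir_99-1"):
        print(line)
```

The end of the log is usually where the Python exception from souporcell.py itself shows up, which is the part worth posting.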
sourporcell_ownData_1cluster.txt
Hey, sorry for the late reply, here is the output for the 2017-10X-99:1 data:
-> first attachment
The case with "-k 1" on our own data seems different: souporcell.log ends as usual (like with k=2), just with only 1 output dimension:
-> second attachment
However, the 3 files usually created after "troublet.done" (namely ambient_rna.txt, cluster_genotypes.vcf and consensus.done) are missing, so I guess it crashed at that step. So it does not seem to be the same problem.
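A quick way to confirm where such a run stopped is to check which of these late-stage files are present. This is only a rough sketch: the three file names are the ones listed above, while the function name missing_outputs and the idea of checking them in one go are assumptions of mine, not something souporcell provides:

```python
from pathlib import Path

# Files named in this thread as normally appearing after troublet.done
EXPECTED_FILES = ("ambient_rna.txt", "cluster_genotypes.vcf", "consensus.done")


def missing_outputs(out_dir, expected=EXPECTED_FILES):
    """Return the expected file names that do not exist in out_dir.

    Hypothetical helper for illustration; it only checks for existence.
    """
    out = Path(out_dir)
    return [name for name in expected if not (out / name).exists()]


if __name__ == "__main__":
    # Run directory taken from the traceback above; adjust to your own run
    missing = missing_outputs("10X2017data/rundir_99-1")
    print("missing:", ", ".join(missing) if missing else "none")
```

If all three are reported missing while troublet.done exists, the run most likely stopped in the step right after troublet.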
The problem from the first file is a bug I already fixed but have not put in the release version yet.
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1LWKHs-nKXNH5vVkP8iJAgLquiE9FLMb5' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1LWKHs-nKXNH5vVkP8iJAgLquiE9FLMb5" -O souporcell.sif && rm -rf /tmp/cookies.txt
Grab this container and it should fix it.