Bug when only 1 cluster in data?

Open JulianRein opened this issue 5 years ago • 3 comments

Hi, first: thanks for the singularity-image, a great way to provide scientific tools!

Now to the issue: we tried the 99:1 mix of two donors (B and C) from 10X / Zheng et al. 2017 ("Massively parallel digital transcriptional profiling of single cells") with souporcell. Unfortunately, we got the following error:

imports done
checking bam for expected tags
checking fasta
restarting pipeline in existing directory 10X2017data/rundir_99-1
running souporcell clustering
Traceback (most recent call last):
  File "/opt/souporcell/souporcell_pipeline.py", line 446, in <module>
    souporcell(args, ref_mtx, alt_mtx)
  File "/opt/souporcell/souporcell_pipeline.py", line 387, in souporcell
    "-t", str(args.threads), "-l", args.max_loci, "--min_alt", args.min_alt, "--min_ref", args.min_ref,'--out',cluster_file],stdout=log,stderr=log)                              
  File "/opt/conda/envs/py36/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['souporcell.py', '-a', '10X2017data/rundir_99-1/alt.mtx', '-r', '10X2017data/rundir_99-1/ref.mtx', '-b', '10X2017data/99-1/outs/filtered_feature_bc_matrix/barcodes.tsv', '-k', '2', '--restarts', '15', '-t', '8', '-l', '2048', '--min_alt', '4', '--min_ref', '4', '--out', '10X2017data/rundir_99-1/clusters_tmp.tsv']' returned non-zero exit status 1.

The 50-50 and 90-10 mixes from the same paper run smoothly. As a side note, when we tried "-k 1" on our own data (to get the loss value with only one assumed donor), we also got an error, which I did not write down. Could there be an issue when the data or the parameters lead souporcell to find only 1 cluster?

Best Julian

JulianRein avatar Oct 15 '19 13:10 JulianRein

Thanks. I have not thought about the one cluster case. I will look into that.

For the error, could you post what it says in the souporcell.log file in the output directory?
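If the log is long, the last few lines are usually enough to start with. A minimal sketch for pulling them out, assuming the log is a plain-text file named souporcell.log inside the run directory shown in the traceback (the helper name `tail_log` is made up for illustration and is not part of souporcell):

```python
from collections import deque
from pathlib import Path


def tail_log(log_path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    # deque with maxlen keeps only the final n lines while streaming the file.
    with open(log_path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    # Assumed location: run directory from the traceback plus the log name
    # mentioned above. Adjust to your own output directory.
    log_file = Path("10X2017data/rundir_99-1") / "souporcell.log"
    if log_file.exists():
        print("\n".join(tail_log(log_file)))
    else:
        print(f"{log_file} not found")
```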

wheaton5 avatar Oct 15 '19 13:10 wheaton5

sourporcell_10Xdata.txt

sourporcell_ownData_1cluster.txt

Hey, sorry for the late reply, here is the output for the 2017-10X-99:1 data:

-> first attachment

For the case with "-k 1" on our own data, it looks different: souporcell.log ends as "usual" (as with k=2), just with only 1 output dimension:

-> second attachment

However, the 3 files usually created after "troublet.done" (namely ambient_rna.txt, cluster_genotypes.vcf, and consensus.done) are missing, so I guess it crashed there. So it does not seem to be the same problem.
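For anyone wanting to check the same thing on their own run, here is a rough sketch that lists which of those files are absent from a run directory. The three file names are the ones mentioned above; whether they are the complete set of late-stage outputs is an assumption, and `missing_outputs` is a hypothetical helper, not something souporcell provides.

```python
from pathlib import Path

# File names taken from the comment above; the list may not be exhaustive.
EXPECTED_OUTPUTS = ("ambient_rna.txt", "cluster_genotypes.vcf", "consensus.done")


def missing_outputs(run_dir, expected=EXPECTED_OUTPUTS):
    """Return the expected file names that do not exist in run_dir."""
    run_dir = Path(run_dir)
    return [name for name in expected if not (run_dir / name).exists()]
```

Calling `missing_outputs("path/to/rundir")` returns an empty list when all three files are present, so a non-empty result points at the stage after "troublet.done".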

JulianRein avatar Oct 20 '19 12:10 JulianRein

The problem from the first file is a bug I already fixed but have not put in the release version yet.

wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1LWKHs-nKXNH5vVkP8iJAgLquiE9FLMb5' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1LWKHs-nKXNH5vVkP8iJAgLquiE9FLMb5" -O souporcell.sif && rm -rf /tmp/cookies.txt

Grab this container and it should fix it.

wheaton5 avatar Oct 21 '19 14:10 wheaton5