Unknown AssertionError

Open susheelbhanu opened this issue 2 years ago • 4 comments

Hey @MrOlm,

I'm running dRep via snakemake with the following:

Job 0: Running dRep on all NOMIS MAGs
Reason: Forced execution

(date && dRep dereplicate $(dirname /scratch/users/sbusi/nomis_mags/results/Bins/dRep/dereplicated_genomes) \
-p 28 -comp 70 -con 10 \
--genomeInfo /scratch/users/sbusi/nomis_mags/results/Bins/checkmBeforedRep.tsv \
-g /scratch/users/sbusi/nomis_mags/results/renamed_mags/*fa \
--multiround_primary_clustering --run_tertiary_clustering && date) 2> /scratch/users/sbusi/nomis_mags/results/logs/drep/drep.err.log > /scratch/users/sbusi/nomis_mags/results/logs/drep/drep.out.log
Activating conda environment: snakemake_envs/0a2d0324514328db8685fdb0c0b69b98

The log file shows an AssertionError without any hints similar to other issues previously reported. Please see below:

***************************************************
    ..:: dRep dereplicate Step 4. Evaluate ::..
***************************************************

Running tertiary clustering on genome representatives
Running primary clustering
Running pair-wise MASH clustering
1414 primary clusters made
Running secondary clustering
Running 3137 ANImf comparisons- should take ~ 56.0 min
Step 4. Return output
Loading work directory
Traceback (most recent call last):
  File "/mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/bin/dRep", line 32, in <module>
    Controller().parseArguments(args)
  File "/mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/lib/python3.8/site-packages/drep/controller.py", line 100, in parseArguments
    self.dereplicate_operation(**vars(args))
  File "/mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/lib/python3.8/site-packages/drep/controller.py", line 48, in dereplicate_operation
    drep.d_workflows.dereplicate_wrapper(kwargs['work_directory'],**kwargs)
  File "/mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/lib/python3.8/site-packages/drep/d_workflows.py", line 53, in dereplicate_wrapper
    drep.d_evaluate.d_evaluate_wrapper(wd, evaluate = '23', **kwargs)
  File "/mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/lib/python3.8/site-packages/drep/d_evaluate.py", line 25, in d_evaluate_wrapper
    run_tertiary_clustering(wd, **kwargs)
  File "/mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/lib/python3.8/site-packages/drep/d_evaluate.py", line 334, in run_tertiary_clustering
    drep.d_choose.d_choose_wrapper(wd.location, **kwargs_copy)
  File "/mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/lib/python3.8/site-packages/drep/d_choose.py", line 72, in d_choose_wrapper
    Gdb = add_centrality(wd, Gdb, **kwargs)
  File "/mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/lib/python3.8/site-packages/drep/d_choose.py", line 322, in add_centrality
    assert len(ndb) == (mlen * mlen) - mlen
AssertionError

drep.yaml.txt

Not sure if it has to do with the length of ndb. Any thoughts on how to get around this? I'm also attaching the drep.yaml file that I used to build my environment.
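For reference, the failing line compares len(ndb) with (mlen * mlen) - mlen, which is the number of ordered pairs among mlen items once self-pairs are excluded. The sketch below illustrates that arithmetic only. It assumes ndb holds one row per ordered genome comparison and mlen is the number of genomes in a cluster, which the thread does not confirm; the helper name missing_pairs and the genome names are hypothetical and not part of dRep.

```python
from itertools import permutations


def missing_pairs(genomes, compared_pairs):
    """Return the ordered (reference, query) pairs absent from compared_pairs."""
    # All ordered pairs without self-comparisons: n * n - n of them.
    expected = set(permutations(genomes, 2))
    return sorted(expected - set(compared_pairs))


# Hypothetical cluster of three genomes.
genomes = ["A.fa", "B.fa", "C.fa"]

# Hypothetical comparison table in which C.fa vs A.fa produced no row.
compared = [
    ("A.fa", "B.fa"), ("B.fa", "A.fa"),
    ("A.fa", "C.fa"),
    ("B.fa", "C.fa"), ("C.fa", "B.fa"),
]

mlen = len(genomes)
print(len(compared) == (mlen * mlen) - mlen)  # the same check as the assertion
print(missing_pairs(genomes, compared))       # which pairs are absent
```

With one comparison missing, the equality check is false, which is the situation an assertion of this shape would flag. Listing the absent pairs is one way to narrow down which genomes are involved.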

Thank you very much, Susheel

susheelbhanu, Jul 30 '22 11:07

Hi @susheelbhanu

Interesting. A couple of thoughts-

  1. Can you confirm that you're on the most up-to-date version of dRep? I remember this bug from a previous version but I thought I fixed it.

  2. Can you confirm that all dependencies are properly installed by running dRep check_dependencies?

  3. If neither of those works, I believe setting the centrality score to 0 should be a successful workaround

-Matt

MrOlm, Aug 02 '22 14:08

Hey @MrOlm,

Thanks much for the quick response. Please see the answers to your questions below.

  1. The version I'm using is: drep==3.2.2
                ...::: dRep v3.2.2 :::...

  Matt Olm. MIT License. Banfield Lab, UC Berkeley. 2017 (last updated 2020)

  See https://drep.readthedocs.io/en/latest/index.html for documentation
  Choose one of the operations below for more detailed help.

  Example: dRep dereplicate -h

  Commands:
    compare            -> Compare and cluster a set of genomes
    dereplicate        -> De-replicate a set of genomes
    check_dependencies -> Check which dependencies are properly installed
  2. For check_dependencies, I have the following:
mash.................................... all good        (location = /mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/bin/mash)
nucmer.................................. all good        (location = /mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/bin/nucmer)
checkm.................................. !!! ERROR !!!   (location = None)
ANIcalculator........................... !!! ERROR !!!   (location = None)
prodigal................................ all good        (location = /mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/bin/prodigal)
centrifuge.............................. !!! ERROR !!!   (location = None)
nsimscan................................ !!! ERROR !!!   (location = None)
fastANI................................. all good        (location = /mnt/lscratch/users/sbusi/SnakemakeBinning/snakemake_envs/0a2d0324514328db8685fdb0c0b69b98/bin/fastANI)
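A report in this layout can also be scanned programmatically. The sketch below is a minimal example, assuming each line follows the "name....status (location = path)" pattern shown above; the function name missing_dependencies is hypothetical, and the paths in the sample are shortened placeholders rather than the real ones.

```python
import re

# One report line: tool name, a run of dots, a status, then the location.
LINE_RE = re.compile(r"^(\w+)\.+\s+(.+?)\s+\(location = (.+)\)$")


def missing_dependencies(report):
    """Return the names of tools whose reported location is None."""
    missing = []
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(3) == "None":
            missing.append(match.group(1))
    return missing


# Sample in the same layout as the output above (paths are placeholders).
sample = """\
mash.................................... all good        (location = /env/bin/mash)
checkm.................................. !!! ERROR !!!   (location = None)
ANIcalculator........................... !!! ERROR !!!   (location = None)
fastANI................................. all good        (location = /env/bin/fastANI)
"""

print(missing_dependencies(sample))
```

A missing tool is not necessarily a problem. Whether it matters depends on which options the run actually uses.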

I did not install CheckM since I was providing the quality metrics, and skipped centrifuge and nsimscan. It does, however, look like ANIcalculator, which is "recommended", may in fact be essential for running the clustering steps. See my note below.

  3. How does one set the centrality score? Is it a flag I can use?

Note: After reducing the number of MAGs to fewer than 5000 and removing the two flags --multiround_primary_clustering and --run_tertiary_clustering from the original shell command, everything ran just fine.

Thanks again for your help. I will put in the ANIcalculator and see if that resolves it with the original full set of MAGs. -Susheel

susheelbhanu, Aug 03 '22 05:08

Hi Susheel,

No need to install any of the other dependencies- those aren't needed for what you're running.

The two recommendations I have are to update to the latest version of dRep (v3.4.0), though I'm not sure this will fix the problem, and to add the flag -centW 0 to remove the centrality scoring. If neither of those fixes the problem, we can go from there.
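As a rough illustration of the second suggestion, the sketch below adds -centW 0 to an argument list modeled on the command from the original post. The work directory, file names, and the helper with_centrality_disabled are hypothetical placeholders; only the flags themselves come from this thread. The script prints the command rather than running it.

```python
import shlex


def with_centrality_disabled(args):
    """Return a copy of a dRep argument list with -centW set to 0."""
    args = list(args)
    if "-centW" in args:
        idx = args.index("-centW")
        if idx + 1 < len(args):
            args[idx + 1] = "0"    # overwrite an existing value
        else:
            args.append("0")       # flag was last with no value
    else:
        args += ["-centW", "0"]    # flag absent: append it
    return args


# Placeholder paths; substitute the real work directory and inputs.
base = [
    "dRep", "dereplicate", "work_dir",
    "-p", "28", "-comp", "70", "-con", "10",
    "--genomeInfo", "checkm.tsv",
    "-g", "mags/a.fa", "mags/b.fa",
    "--multiround_primary_clustering", "--run_tertiary_clustering",
]

cmd = with_centrality_disabled(base)
print(shlex.join(cmd))
```

Building the command as a list keeps the change small and easy to inspect before the job is resubmitted, for example from a Snakemake rule.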

Best, Matt

MrOlm, Aug 03 '22 15:08

Thanks Matt. Is v3.4.0 on conda or pip? Will give this a go and get back. May take a bit though before I reply again.

susheelbhanu, Aug 04 '22 05:08