
Error in DIAMOND execution

Open KateSakharova opened this issue 1 year ago • 2 comments

Hello, I bumped into a problem with DIAMOND:

INFO: Running CheckM2 version 1.0.1
INFO: Custom database path provided for predict run. Checking database at uniref100.KO.1.dmnd...
INFO: Running quality prediction workflow with 8 threads.
INFO: Calling genes in 1 bins with 8 threads:
    Finished processing 1 of 1 (100.00%) bins.
INFO: Calculating metadata for 1 bins with 8 threads:
    Finished processing 1 of 1 (100.00%) bin metadata.
INFO: Annotating input genomes with DIAMOND using 8 threads
INFO: Processing DIAMOND output
ERROR: No DIAMOND annotation was generated. Exiting

execution command: singularity run quay.io-biocontainers-checkm2-1.0.1--pyh7cba7a3_0.img checkm2 predict --threads 8 --input bins -x fa --output-directory binner13_checkm_output --database_path uniref100.KO.1.dmnd

The bins folder contains a single bin, bin.fa (attached as bins.fa.gz).

Should CheckM2 generate empty output in that case? Could you explain what went wrong with the DIAMOND execution?

Thanks! Best, Kate

KateSakharova, Feb 07 '24 11:02

Hi,

The problem here lies with translation when using Prodigal: the ~940 Kb bin yields only 31 predicted proteins, most of which are tiny. As a result, DIAMOND cannot confidently assign KEGG IDs to any of the predicted protein fragments and generates no output, which triggers the CheckM2 error because that is the only bin in the input. Is it possible the bin contains non-prokaryotic DNA that doesn't play well with Prodigal?
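
For anyone wanting to check their own bins for the same symptom, here is a minimal Python sketch (not part of CheckM2) that counts the proteins in a Prodigal-style protein FASTA and roughly estimates what fraction of the bin is coding. The function names and the example file name are hypothetical, and the coding fraction is only an approximation (3 bp per amino acid, stop codons ignored).

```python
from statistics import median


def protein_lengths(fasta_text):
    """Return the length (in amino acids) of each record in a protein FASTA string."""
    lengths = []
    current = None  # running length of the record being read
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            # Prodigal marks a translated stop codon with a trailing '*'.
            current += len(line.rstrip("*"))
    if current is not None:
        lengths.append(current)
    return lengths


def summarize(lengths, bin_size_bp):
    """Summarize protein count, median length, and a rough coding fraction."""
    coding_bp = 3 * sum(lengths)  # approximation: 3 bp per amino acid
    return {
        "n_proteins": len(lengths),
        "median_length_aa": median(lengths) if lengths else 0,
        "coding_fraction": coding_bp / bin_size_bp if bin_size_bp else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical path: point this at the protein FASTA predicted for your bin.
    with open("bin.faa") as handle:
        stats = summarize(protein_lengths(handle.read()), bin_size_bp=940_000)
    print(stats)
```

A very low protein count and coding fraction for a bin of this size, as seen here, would be consistent with input that Prodigal does not handle well.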

chklovski, Feb 08 '24 00:02

Hi @chklovski, thank you for your answer! I have some bins from the same run annotated as eukaryotic MAGs, so your assumption might be correct. I will look into this more closely.

Kate

KateSakharova, Feb 14 '24 19:02