"Error: Array size overflow" when using Diamond on large custom database

Open • CaroleBelliardo opened this issue 7 months ago • 1 comment

Hi BBuchfink,

I regularly use this tool, and it usually works fine. I'm encountering an issue while using Diamond on a custom database that is 609 GB in size and includes integrated taxonomic information. Previously, my job ran smoothly on an earlier 607 GB database version that included only some of the taxonomic information.

I did not receive any error messages during the database formatting process. However, when I run a blastp analysis on the same proteome, I get the following error message after 3 minutes: “Error: Array size overflow”.

Could you please help me resolve this issue?

My Diamond log file contains:

diamond blastp --more-sensitive --max-target-seqs 500 --evalue 0.001 --threads 70 --db /kwak/hub/25_cbelliardo/DB_Metag_Soildb/SoilDB_nr_v4.dmnd --query XINDMERG.20240529.recipeA.20240529.prot.fasta --out XINDMERG.20240529.recipeA.20240529.prot_diamond_taxids_exclude_metag.tsv --taxon-exclude 46003 -b 10 -c 1 --tmpdir . --log --verbose --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore staxids
#CPU threads: 70
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
CPU features detected: ssse3 popcnt sse4.1 avx2
L3 cache size: 47185920
MAX_SHAPE_LEN=19 SEQ_MASK STRICT_BAND
Temporary directory: .
#Target sequences to report alignments for: 500
DP fields: 510
Opening the database... Error: Array size overflow.

CaroleBelliardo, Jul 18 '24 16:07