Unrecognized sequence character in BLAST database

Open: anearman opened this issue 1 year ago • 1 comment

Hello!

I am running a query of a single-organism proteome against the preformatted BLAST nr database. I can prep the database with no issues. When I run the query with only basic input and output parameters, I get the following error, at a different point in the run on each attempt:

Current RSS: 2.4 GB, Peak RSS: 4.3 GB
Initializing dictionary... [0.007s]
Initializing temporary storage... Async_buffer() 10283,643 [0s]
Building reference histograms... terminate called after throwing an instance of 'std::runtime_error'
what(): Unrecognized sequence character in BLAST database
Aborted (core dumped)

I have downloaded the database multiple times, using both update_blastdb and direct FTP. I have also reinstalled diamond from the precompiled binaries. My machine is a 32-core Intel system with 128 GB of RAM. Any suggestions?

Thank you!

anearman, Aug 07 '24 13:08

I tested this with the latest nr database and could not reproduce the error. Perhaps NCBI has fixed it in the meantime?

bbuchfink, Dec 27 '24 09:12