Unrecognized sequence character in BLAST database
Hello!
I am running a query of a single-organism proteome against the preformatted BLAST nr database. I can prep the database with no issues. When I run my query with only basic input and output parameters, I get the following error, after a different amount of time on each attempt:
Current RSS: 2.4 GB, Peak RSS: 4.3 GB
Initializing dictionary... [0.007s]
Initializing temporary storage... Async_buffer() 10283,643 [0s]
Building reference histograms... terminate called after throwing an instance of 'std::runtime_error'
what(): Unrecognized sequence character in BLAST database
Aborted (core dumped)
I have downloaded the database multiple times, using both update_blastdb and direct FTP. I have also reinstalled DIAMOND from the precompiled binaries. My machine is a 32-core Intel with 128 GB of RAM. Any suggestions?
Thank you!
I tested this with the latest nr database and could not reproduce the error. Perhaps NCBI has fixed it in the meantime?
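If the error does come back, one way to narrow it down might be to dump the database to FASTA (for example with blastdbcmd) and scan the dump for unexpected residue letters. The following is only a rough sketch: the ALLOWED set is my assumption about which letters are reasonable in a protein database, not the list DIAMOND actually checks against, and find_bad_residues is a hypothetical helper name.

```python
import io

# Assumed residue alphabet: the 20 standard amino acids plus B, J, Z, X, U, O
# and '*'. This is an assumption; the set DIAMOND accepts may differ.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYBJZXUO*")


def find_bad_residues(handle, allowed=ALLOWED):
    """Yield (header, line_number, bad_chars) for each FASTA sequence line
    that contains a character outside `allowed`."""
    header = None
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Remember the most recent header so hits can be attributed.
            header = line[1:]
        elif line:
            bad = sorted(set(line.upper()) - allowed)
            if bad:
                yield header, line_number, "".join(bad)


# Small in-memory demo; for a real dump, pass an open file handle instead.
sample = io.StringIO(">seq1 ok\nMKV\n>seq2 odd\nMK1V-\n")
for hit in find_bad_residues(sample):
    print(hit)
```

For a real check, replace the in-memory sample with the opened FASTA dump. Reading line by line keeps memory use flat, which matters for a database the size of nr.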