barrnap
nhmmer failed to run - Error: Invalid alphabet type in target for nhmmer. Expect DNA or RNA
Hello, I am trying to run barrnap to identify rRNA in a eukaryotic genome. The command is as follows:
barrnap --kingdom euk --threads 20 --outseq rRNA.fasta < chr1.fasta
After running it, we got the following error. Can you suggest how to solve this problem? Thanks!
[barrnap] This is barrnap 0.9
[barrnap] Written by Torsten Seemann
[barrnap] Obtained from https://github.com/tseemann/barrnap
[barrnap] Detected operating system: linux
[barrnap] Adding /miniconda3/lib/barrnap/bin/../binaries/linux to end of PATH
[barrnap] Checking for dependencies:
[barrnap] Found nhmmer - /miniconda3/bin/nhmmer
[barrnap] Found bedtools -/miniconda3/bin/bedtools
[barrnap] Will use 20 threads
[barrnap] Setting evalue cutoff to 1e-06
[barrnap] Will tag genes < 0.8 of expected length.
[barrnap] Will reject genes < 0.25 of expected length.
[barrnap] Using database: /miniconda3/lib/barrnap/bin/../db/euk.hmm
[barrnap] Scanning chr1.fasta for euk rRNA genes... please wait
[barrnap] Command: nhmmer --cpu 20 -E 1e-06 --w_length 3878 -o /dev/null --tblout /dev/stdout '/miniconda3/lib/barrnap/bin/../db/euk.hmm' 'chr1.fasta'
[barrnap] ERROR: nhmmer failed to run - Error: Invalid alphabet type in target for nhmmer. Expect DNA or RNA.
I am sure the FASTA sequence contains no characters other than A/T/C/G.
I've gotten the same error. For me, the cause was one sequence composed entirely of G and T nucleotides. Adding a single A and a single C nucleotide made the error go away. This should be an easy fix :-)
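To check whether a file contains such a sequence, something like the following might help. It is only a rough sketch: `missing_bases` is a made-up helper name, not part of barrnap or HMMER, and it assumes that a record lacking one of A, C, G or T is what trips nhmmer's alphabet guess, as described above.

```shell
# Hypothetical helper (not part of barrnap or HMMER): print the ID of every
# FASTA record whose sequence lacks at least one of A, C, G, T.
missing_bases() {
  awk '
    # Print the current record name if any of the four bases was never seen.
    function report() {
      if (name != "" && !(a && c && g && t)) print name
    }
    # Header line: report the previous record, then reset the flags.
    /^>/ { report(); name = substr($1, 2); a = c = g = t = 0; next }
    # Sequence line: note which bases occur (case-insensitive).
    {
      s = toupper($0)
      if (index(s, "A")) a = 1
      if (index(s, "C")) c = 1
      if (index(s, "G")) g = 1
      if (index(s, "T")) t = 1
    }
    END { report() }
  ' "$1"
}
```

Running `missing_bases chr1.fasta` prints one record ID per line; no output means every record contains all four bases.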
I just ran into this problem as well. I added an A and still got the same error. From what I found online, it is a problem with the conda installation.
I just stumbled upon this. In case this is still relevant, @zxgsy520: there is a switch, --dna, that sets the alphabet type for the query and uses it as a "guide" when the alphabet type of the target cannot be guessed. It was introduced in this PR: https://github.com/EddyRivasLab/hmmer/pull/252
The switch is available in nhmmer v3.3.2 installed via conda.
Correction: the fix in the PR has only been merged into the dev branch. The --dna switch exists in the latest release, but the release does not include the fix.
I bypassed this issue by replacing all ambiguous bases (M, K, H, etc.) with N.
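One way this replacement could be done is sketched below. The function name `mask_ambiguous` is hypothetical, and the set of characters treated as ambiguous (the IUPAC codes R, Y, S, W, K, M, B, D, H, V in either case) is an assumption about what "et al." covers. Header lines are left untouched.

```shell
# Hypothetical helper: replace IUPAC ambiguity codes with N on sequence lines.
# Lines starting with ">" (FASTA headers) are passed through unchanged.
mask_ambiguous() {
  sed '/^>/!s/[RYSWKMBDHVryswkmbdhv]/N/g' "$@"
}
```

For example, `mask_ambiguous chr1.fasta > chr1.masked.fasta` writes a masked copy (the output file name is just an illustration) that can then be passed to barrnap instead of the original.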
I got this issue for a genome that started with a telomere repeat and did not have all four bases in the first few hundred characters. I got around it by replacing the first four characters of each sequence with GATC and then running on the temporary file:
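# Strip the directory and the .fasta extension to get a prefix for the temporary file
PREFIX=$(basename "${GENOME/.fasta/}")
# Overwrite the first four bases after each header with GATC (sequence start only, so wrapped FASTA lines are left alone)
sed '/^>/{n;s/^[ACGT][ACGT][ACGT][ACGT]/GATC/;}' "$GENOME" > "$PREFIX.tmp.fasta"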