nhmmer failed to run - Error: Invalid alphabet type in target for nhmmer. Expect DNA or RNA

Open minjinhan opened this issue 3 years ago • 6 comments

Hello, I am trying to run barrnap to identify rRNA genes in a eukaryotic genome. The command is as follows:

barrnap --kingdom euk --threads 20 --outseq rRNA.fasta < chr1.fasta

After running it, we got the following error. Can you suggest how to solve this problem? Thanks!

[barrnap] This is barrnap 0.9
[barrnap] Written by Torsten Seemann
[barrnap] Obtained from https://github.com/tseemann/barrnap
[barrnap] Detected operating system: linux
[barrnap] Adding /miniconda3/lib/barrnap/bin/../binaries/linux to end of PATH
[barrnap] Checking for dependencies:
[barrnap] Found nhmmer - /miniconda3/bin/nhmmer
[barrnap] Found bedtools -/miniconda3/bin/bedtools
[barrnap] Will use 20 threads
[barrnap] Setting evalue cutoff to 1e-06
[barrnap] Will tag genes < 0.8 of expected length.
[barrnap] Will reject genes < 0.25 of expected length.
[barrnap] Using database: /miniconda3/lib/barrnap/bin/../db/euk.hmm
[barrnap] Scanning chr1.fasta for euk rRNA genes... please wait
[barrnap] Command: nhmmer --cpu 20 -E 1e-06 --w_length 3878 -o /dev/null --tblout /dev/stdout '/miniconda3/lib/barrnap/bin/../db/euk.hmm' 'chr1.fasta'
[barrnap] ERROR: nhmmer failed to run - Error: Invalid alphabet type in target for nhmmer. Expect DNA or RNA.

I am sure the FASTA sequence contains no characters other than A/T/C/G.

minjinhan avatar Jan 16 '21 16:01 minjinhan

I've gotten the same error. For me, the cause was one sequence composed entirely of G and T nucleotides. Adding a single A and a single C nucleotide made the error go away. This should be an easy fix :-)
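To find out whether an input has the same problem, a minimal sketch along these lines could list the records that lack at least one of the four bases. The function name `missing_bases` is hypothetical, and the check rests only on the report above that a G/T-only sequence triggered the error; it is not a description of how nhmmer guesses the alphabet.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<record id><TAB><missing bases>" for every
# FASTA record that does not contain all four of A, C, G and T.
# Case-insensitive; anything other than A/C/G/T is simply ignored.
missing_bases() {
    awk '
        function report(   miss) {
            if (!have) return
            miss = (a ? "" : "A") (c ? "" : "C") (g ? "" : "G") (t ? "" : "T")
            if (miss != "") print id "\t" miss
        }
        /^>/ {
            report()                      # flush the previous record
            id = substr($1, 2)            # header up to the first whitespace
            have = 1; a = c = g = t = 0
            next
        }
        {
            u = toupper($0)
            if (index(u, "A")) a = 1
            if (index(u, "C")) c = 1
            if (index(u, "G")) g = 1
            if (index(u, "T")) t = 1
        }
        END { report() }
    ' "$1"
}
```

Calling `missing_bases chr1.fasta` prints nothing when every record contains all four bases.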

snayfach avatar Feb 01 '21 01:02 snayfach

I just ran into this problem as well.

jdwinkler-lanzatech avatar Aug 18 '21 13:08 jdwinkler-lanzatech

I just ran into this problem as well. I added an A and still got the same error. From what I found online, it is a problem with the conda installation.

zxgsy520 avatar Jan 12 '22 08:01 zxgsy520

I just stumbled upon this. In case it is still relevant, @zxgsy520: there is a switch, --dna, that sets the alphabet type for the query and uses it as a "guide" when the alphabet type of the target cannot be guessed. It was introduced in this PR: https://github.com/EddyRivasLab/hmmer/pull/252 The switch is available in nhmmer v3.3.2 installed via conda.

Correction: the fix in the PR has only been merged into the dev branch. The --dna switch exists in the latest release, but that release does not include the fix.

ptrebert avatar Mar 17 '22 10:03 ptrebert

I bypassed this issue by replacing all ambiguous bases (M, K, H, etc.) with N.
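A minimal sketch of that replacement, assuming "ambiguous bases" means the standard IUPAC ambiguity codes (R, Y, S, W, K, M, B, D, H, V). The function name `mask_ambiguous` is hypothetical. Header lines are left untouched and the case of each base is preserved, so soft-masking survives.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a copy of the FASTA file to stdout in which
# IUPAC ambiguity codes in sequence lines are replaced by N (or n).
# Lines starting with ">" are headers and are passed through unchanged.
mask_ambiguous() {
    sed -e '/^>/!s/[RYSWKMBDHV]/N/g' \
        -e '/^>/!s/[ryswkmbdhv]/n/g' "$1"
}
```

For example, `mask_ambiguous chr1.fasta > chr1.masked.fasta` produces a temporary file to pass to barrnap. Sequence lengths do not change, so reported coordinates still refer to the original genome.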

ZeweiSong avatar Apr 19 '22 01:04 ZeweiSong

I got this issue for a genome that started with a telomere repeat and did not have all four bases in the first few hundred characters. I got around it by replacing the first four characters of each sequence with GATC and then running on the temporary file:

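# Rewrites every line that begins with four A/C/G/T characters, so this assumes one line per sequence.
PREFIX=$(basename "${GENOME%.fasta}")
sed 's/^[ACGT][ACGT][ACGT][ACGT]/GATC/' "$GENOME" > "$PREFIX.tmp.fasta"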
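For FASTA files that wrap sequences over several lines, a variant that touches only the first sequence line of each record might look like the sketch below. The function name `patch_record_starts` is hypothetical. Unlike the sed command, it overwrites the first four characters whatever they are (including N or lowercase bases), and it skips records whose first line is shorter than four characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a copy of the FASTA file to stdout in which
# the first four characters of each record's first sequence line are
# replaced with GATC. All other lines are passed through unchanged.
patch_record_starts() {
    awk '
        /^>/ { print; first = 1; next }   # header: next line starts a record
        first {
            if (length($0) >= 4) $0 = "GATC" substr($0, 5)
            first = 0                     # only the first sequence line
        }
        { print }
    ' "$1"
}
```

Usage would be along the lines of `patch_record_starts "$GENOME" > "$PREFIX.tmp.fasta"`. Because the replacement keeps the sequence length, coordinates reported on the temporary file still map onto the original genome.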

cabbagesofdoom avatar Jun 06 '23 14:06 cabbagesofdoom