
UnicodeDecodeError: 'utf-8' codec can't decode

Open leonhardt913 opened this issue 3 years ago • 1 comments

I ran the basic function in CMD and it showed the following error messages:

D:\file>emapper.py -i test.fasta -o out
#  emapper-2.1.9
# emapper.py  -i test.fasta -o out
  d:\python37\lib\site-packages\eggnogmapper\bin\diamond blastp -d d:\python37\lib\site-packages\data\eggnog_proteins.dmnd -q D:\file\test.fasta --threads 1 -o D:\file\out.emapper.hits --tmpdir D:\file\emappertmp_dmdn_go7mwj8a --sensitive --iterate -e 0.001 --top 3  --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qcovhsp scovhsp
Traceback (most recent call last):
  File "d:\python37\lib\site-packages\eggnogmapper\search\diamond\diamond.py", line 262, in run_diamond
    completed_process = subprocess.run(cmd, capture_output=True, check=True, shell=True)
  File "d:\python37\lib\subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'd:\python37\lib\site-packages\eggnogmapper\bin\diamond blastp -d d:\python37\lib\site-packages\data\eggnog_proteins.dmnd -q D:\file\test.fasta --threads 1 -o D:\file\out.emapper.hits --tmpdir D:\file\emappertmp_dmdn_go7mwj8a --sensitive --iterate -e 0.001 --top 3  --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qcovhsp scovhsp' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Python37\Scripts\emapper.py", line 704, in <module>
    n, elapsed_time = emapper.run(args, args.input, args.annotate_hits_table, args.cache_file)
  File "d:\python37\lib\site-packages\eggnogmapper\emapper.py", line 338, in run
    searcher, searcher_name, hits, queries_file = self.search(args, infile, predictor)
  File "d:\python37\lib\site-packages\eggnogmapper\emapper.py", line 167, in search
    raise(e)
  File "d:\python37\lib\site-packages\eggnogmapper\emapper.py", line 164, in search
    pjoin(self._current_dir, self.search_out_file))
  File "d:\python37\lib\site-packages\eggnogmapper\search\diamond\diamond.py", line 179, in search
    raise e
  File "d:\python37\lib\site-packages\eggnogmapper\search\diamond\diamond.py", line 155, in search
    cmds = self.run_diamond(in_file, hits_file)
  File "d:\python37\lib\site-packages\eggnogmapper\search\diamond\diamond.py", line 265, in run_diamond
    raise EmapperException("Error running diamond: "+cpe.stderr.decode("utf-8").strip().split("\n")[-1])
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb2 in position 57: invalid start byte

My system is Windows 10 [10.0.19042.1052]. I downloaded and unzipped eggnog.db and eggnog_proteins.dmnd manually and put them in "Python37\Lib\site-packages\data".

My FASTA file does not contain "0xb2", which is "ᾲ". I am not sure where it comes from, or whether the encoding is the main problem.
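For reference, the last frame of the traceback shows the failing call is `cpe.stderr.decode("utf-8")`, so the byte comes from the error output of the diamond command rather than from the FASTA file. Below is a minimal sketch of a more tolerant decode, assuming the console writes its messages in a non-UTF-8 code page. `decode_console_bytes` is a hypothetical helper for illustration, not part of eggnog-mapper.

```python
import locale


def decode_console_bytes(data: bytes, fallback: str = "") -> str:
    """Decode subprocess output without raising on non-UTF-8 bytes.

    Hypothetical helper: tries UTF-8 first, then a fallback encoding.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the text was written in the console's locale encoding.
        encoding = fallback or locale.getpreferredencoding(False)
        # errors="replace" substitutes U+FFFD instead of raising again.
        return data.decode(encoding, errors="replace")
```

With something like this, the real error message from the shell would be shown (possibly with a few replacement characters) instead of being hidden behind a `UnicodeDecodeError`.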

Has anyone encountered the same issue and knows how to fix it? Thanks!

update:

I tried another Windows PC and got a different problem:

C:\Python38\Scripts>python emapper.py -i test.fa -o out
#  emapper-2.1.9
# emapper.py  -i test.fa -o out
  C:\Python38\lib\site-packages\eggnogmapper\bin\diamond blastp -d C:\Python38\lib\site-packages\data\eggnog_proteins.dmnd -q C:\Python38\Scripts\test.fa --threads 1 -o C:\Python38\Scripts\out.emapper.hits --tmpdir C:\Python38\Scripts\emappertmp_dmdn_2rld30tt --sensitive --iterate -e 0.001 --top 3  --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qcovhsp scovhsp
Error running diamond: operable program or batch file.

leonhardt913, Dec 14 '22 09:12

It looks like diamond failed, but the console output that would indicate the error is not shown. Try running diamond again on its own, outside of emapper, so you can see its console output.
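If running it by hand in the console is inconvenient, one way to do this from Python is sketched below. It is only an illustration: `run_isolated` is a hypothetical helper, and the command passed to it would be the diamond command line printed in the log above. It omits `check=True` so that a non-zero exit status does not raise, and it decodes with `errors="replace"` so that non-UTF-8 console output cannot trigger a `UnicodeDecodeError`.

```python
import subprocess
import sys


def run_isolated(cmd):
    """Run a command and return (exit code, stdout text, stderr text).

    Hypothetical helper. cmd is a list of arguments; if the executable
    does not exist at all, subprocess.run raises FileNotFoundError.
    """
    completed = subprocess.run(cmd, capture_output=True)
    out = completed.stdout.decode("utf-8", errors="replace")
    err = completed.stderr.decode("utf-8", errors="replace")
    return completed.returncode, out, err


# Stand-in command for demonstration; substitute the diamond invocation.
code, out, err = run_isolated(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
)
print("exit status:", code)
print("stderr:", err)
```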

As for Windows, you need to download the Windows version of diamond available under releases.
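A quick way to check whether a usable diamond executable is actually present is sketched below, using the standard library's `shutil.which`. `find_executable` is a hypothetical helper, and the directory passed as `extra_dir` is an assumption; in this thread it would be the `eggnogmapper\bin` folder shown in the log. On Windows, `shutil.which` also tries the extensions listed in `PATHEXT`, such as `.exe`.

```python
import shutil
from typing import Optional


def find_executable(name: str, extra_dir: Optional[str] = None) -> Optional[str]:
    """Return the full path of an executable, or None if it is not found.

    Hypothetical helper: looks in extra_dir first (if given), then on PATH.
    """
    if extra_dir is not None:
        found = shutil.which(name, path=extra_dir)
        if found:
            return found
    return shutil.which(name)


# Prints None when no runnable "diamond" is found on PATH.
print(find_executable("diamond"))
```

A result of `None` would be consistent with the "operable program or batch file" message, which is the tail of the shell's error for a command it cannot run.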

bbuchfink, Dec 19 '22 13:12