EukDetect
Unicode error
I got this error when running setup:
Traceback (most recent call last):
File "setup.py", line 5, in <module>
long_description = fh.read()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 641: ordinal not in range(128)
To solve it, I set the locale like this:
export LC_ALL=en_US.UTF-8
Then I re-ran setup and it worked.
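For anyone hitting the same traceback, here is a minimal sketch of why the locale matters. It assumes that setup.py opens the README without an explicit encoding, so Python falls back to the locale's codec (ASCII under a C/POSIX locale). The file name and contents below are hypothetical stand-ins, not the actual EukDetect README.

```python
import os
import tempfile


def read_text(path, encoding):
    """Read a text file with an explicit codec instead of the locale default."""
    with open(path, "r", encoding=encoding) as fh:
        return fh.read()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical README containing a non-breaking space (UTF-8 bytes 0xC2 0xA0).
    readme = os.path.join(tmp, "README.md")
    with open(readme, "wb") as fh:
        fh.write("EukDetect\u00a0demo".encode("utf-8"))

    # Decoding as ASCII mimics what happens under an ASCII locale.
    try:
        read_text(readme, "ascii")
        ascii_failed = False
    except UnicodeDecodeError as err:
        ascii_failed = True
        print(err)

    # Decoding as UTF-8 works regardless of the locale setting.
    text = read_text(readme, "utf-8")
```

Setting LC_ALL to a UTF-8 locale changes the default codec for the whole process, which is why the export above resolves the error without editing setup.py.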
I too received a unicode error, although at the stage of trying to run a subset of my actual samples (no errors arose during install, set-up, or testing as outlined). Below is the output from my terminal:
eukdetect --mode runall --configfile Metagenomics_I-Ching_Nachbac_configfile_TESTING_v1.yml
01/24/2023 18:06:24: Parsing config file ...
Traceback (most recent call last):
File "/share/jwaters/anaconda3/envs/eukdetect/bin/eukdetect", line 33, in <module>
sys.exit(load_entry_point('EukDetect==1.0.1', 'console_scripts', 'eukdetect')())
File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/EukDetect-1.0.1-py3.6.egg/eukdetect/runall.py", line 140, in main
File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/EukDetect-1.0.1-py3.6.egg/eukdetect/runall.py", line 454, in check_readlen
File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 611, in parse
for r in i:
File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/Bio/SeqIO/QualityIO.py", line 1033, in FastqPhredIterator
for title_line, seq_string, quality_string in FastqGeneralIterator(handle):
File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/Bio/SeqIO/QualityIO.py", line 897, in FastqGeneralIterator
line = handle_readline()
File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 12: ordinal not in range(128)
Any advice on how to proceed would be greatly appreciated! I did try the 'export LC_ALL=en_US.UTF-8' that hyphaltip recommended, but that did not resolve my issue and I had the same error.
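One way to narrow this down is to check whether the input file itself contains non-ASCII bytes. The sketch below is a hypothetical diagnostic helper, not part of EukDetect or BioPython: it reads a file in binary mode and reports the offset of the first byte at or above 0x80. The demo file and its contents are made up for illustration, and the reported offset will not necessarily match the position shown in the traceback.

```python
import os
import tempfile


def first_non_ascii(path, chunk_size=1 << 20):
    """Return (offset, byte_value) of the first byte >= 0x80, or None if all ASCII."""
    offset = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return None
            for i, value in enumerate(chunk):
                if value >= 0x80:
                    return offset + i, value
            offset += len(chunk)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical FASTQ record with a stray 0x80 byte in the header line.
    bad = os.path.join(tmp, "bad.fastq")
    with open(bad, "wb") as fh:
        fh.write(b"@read1 lane=\x80\nACGT\n+\nIIII\n")

    # Clean record containing only ASCII bytes.
    good = os.path.join(tmp, "good.fastq")
    with open(good, "wb") as fh:
        fh.write(b"@read1\nACGT\n+\nIIII\n")

    bad_hit = first_non_ascii(bad)
    good_hit = first_non_ascii(good)
```

If the helper returns a hit on a file that should be plain-text FASTQ, the file contents rather than the locale may be the source of the problem.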
Thanks for reaching out. I can't reproduce this error and haven't been able to figure out a solution yet, though I'm still trying. It looks like the error is being thrown by BioPython in one of the check steps. Could you try running this as a snakemake pipeline directly? Full instructions are on the GitHub page, but in short it's:
snakemake --snakefile [path_to_install_folder]rules/eukdetect_eukfrac.rules --configfile [config file] --cores [cores] runall
If that works, it will be a workaround.
Thanks so much for your prompt reply! My apologies that I missed in the documentation that using snakemake can be a good workaround if running into issues with python.
I've tried running snakemake as you suggested, but I received the following errors while it was running. It did complete, but it seems the output files are empty.
/usr/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = "en_US.UTF-8",
LANG = "C.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("C.UTF-8").
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = "en_US.UTF-8",
LANG = "C.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("C.UTF-8").
Hi Abigail,
After adding quotes when setting the locale (export LC_ALL="en_US.UTF-8"), the change may have worked, or at least I am no longer receiving locale warnings (apologies that I did not correctly understand the error, I promise I tried to google it!).
However, now when I run the snakemake file, I receive this output to the screen and my output files again appear to be empty. Can you advise what may be going on?
Building DAG of jobs...
Nothing to be done.
Complete log: /share/jwaters/Metagenomics_I-Ching/EukDetect/.snakemake/log/2023-01-27T183421.100804.snakemake.log
Hi,
Thanks for your patience while waiting for a response. This output suggests that no steps of eukdetect ran the second time because the snakemake process didn't detect any changes (it does not pick up the UTF encoding change). The best way to fix this is to delete the eukdetect output files, including those from the intermediate steps, and rerun the whole thing.
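To illustrate why deleting the outputs helps, here is a deliberately simplified model of a make-style rerun check. It is not Snakemake's actual implementation. It assumes a job reruns only when an output is missing or older than its inputs, so an empty output file that already exists still counts as up to date. The file names are hypothetical.

```python
import os
import tempfile


def needs_run(inputs, outputs):
    """Simplified check: rerun if any output is missing or older than the newest input."""
    if any(not os.path.exists(path) for path in outputs):
        return True
    newest_input = max(os.path.getmtime(path) for path in inputs)
    oldest_output = min(os.path.getmtime(path) for path in outputs)
    return oldest_output < newest_input


with tempfile.TemporaryDirectory() as tmp:
    reads = os.path.join(tmp, "sample.fastq")
    table = os.path.join(tmp, "sample_hits.txt")

    with open(reads, "w") as fh:
        fh.write("@read1\nACGT\n+\nIIII\n")
    # An empty output left behind by the earlier failed run.
    with open(table, "w"):
        pass

    # Make the output newer than the input, as it would be after a prior run.
    os.utime(reads, (1000, 1000))
    os.utime(table, (2000, 2000))
    empty_output_present = needs_run([reads], [table])

    # Deleting the output forces the job to be scheduled again.
    os.remove(table)
    after_delete = needs_run([reads], [table])
```

In this model the empty output blocks the rerun until it is removed, which matches the "Nothing to be done" message seen above.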