UnicodeEncodeError: 'ascii' codec can't encode character '\u21b5' in position 4397: ordinal not in range(128)

Open yuemo98 opened this issue 2 years ago • 3 comments

Hi, thanks for your tools!

I am running dRep v3.4.0 to dereplicate genomes, and I have installed CheckM. Here is the error. Why does this happen?


..:: dRep dereplicate Step 1. Filter ::..

Will filter the genome list
3 genomes were input to dRep
Calculating genome info of genomes
Traceback (most recent call last):
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/bin/dRep", line 32, in <module>
    Controller().parseArguments(args)
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/controller.py", line 100, in parseArguments
    self.dereplicate_operation(**vars(args))
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/controller.py", line 48, in dereplicate_operation
    drep.d_workflows.dereplicate_wrapper(kwargs['work_directory'],**kwargs)
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/d_workflows.py", line 29, in dereplicate_wrapper
    drep.d_filter.d_filter_wrapper(wd, genomes = genomes, Chdb = Chdb, **kwargs)
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/d_filter.py", line 73, in d_filter_wrapper
    Gdb = calc_genome_info(bdb['location'].tolist())
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/d_filter.py", line 275, in calc_genome_info
    table['length'].append(calc_fasta_length(loc))
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/d_filter.py", line 655, in calc_fasta_length
    for seq_record in SeqIO.parse(fasta_loc, "fasta"):
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/Bio/SeqIO/Interfaces.py", line 74, in __next__
    return next(self.records)
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/Bio/SeqIO/FastaIO.py", line 206, in iterate
    Seq(sequence), id=first_word, name=first_word, description=title,
  File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/Bio/Seq.py", line 1725, in __init__
    self._data = bytes(data, encoding="ASCII")
UnicodeEncodeError: 'ascii' codec can't encode character '\u21b5' in position 4397: ordinal not in range(128)

Thanks!

yuemo98, Aug 30 '22 18:08

It seems like your input genomes may be compressed. The only compression dRep supports is gzip, and a gzipped genome's filename must end in .gz

MrOlm, Aug 30 '22 18:08

My input genomes are not compressed, and the genome filenames end in .fa

yuemo98, Aug 31 '22 00:08

This error occurs because at least one of the files you are passing to dRep contains characters outside plain ASCII. Maybe you're accidentally including a non-genome file in your input command?

MrOlm, Aug 31 '22 02:08
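
For anyone hitting the same traceback, one way to find the offending file is to scan each input for characters outside the ASCII range before running dRep. The sketch below is not part of dRep or Biopython; the function name `find_non_ascii` is made up for illustration, and it assumes the inputs are uncompressed text files readable as UTF-8.

```python
import sys


def find_non_ascii(path, encoding="utf-8"):
    """Return (line_number, column, character) for every non-ASCII character.

    Hypothetical helper, not a dRep function. Assumes an uncompressed text
    file; bytes that are invalid in the given encoding are replaced with
    U+FFFD and therefore reported as well.
    """
    hits = []
    with open(path, encoding=encoding, errors="replace") as handle:
        for line_no, line in enumerate(handle, start=1):
            for col, char in enumerate(line, start=1):
                if ord(char) > 127:
                    hits.append((line_no, col, char))
    return hits


if __name__ == "__main__":
    # Check every file named on the command line and report the first few hits.
    for genome_path in sys.argv[1:]:
        found = find_non_ascii(genome_path)
        if found:
            print(f"{genome_path}: {len(found)} non-ASCII character(s)")
            for line_no, col, char in found[:5]:
                print(f"  line {line_no}, column {col}: U+{ord(char):04X}")
        else:
            print(f"{genome_path}: ASCII only")
```

Saved under any name and run with the genome files as arguments, it lists the files that contain such characters and where they are. In the traceback above the character is '\u21b5', so a file reporting U+21B5 would be the likely culprit.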