Hi, thanks for your tools!
I am running dRep v3.4.0 to dereplicate genomes, and I have installed CheckM. Here is the error. Why does this happen?
..:: dRep dereplicate Step 1. Filter ::..
Will filter the genome list
3 genomes were input to dRep
Calculating genome info of genomes
Traceback (most recent call last):
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/bin/dRep", line 32, in <module>
Controller().parseArguments(args)
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/controller.py", line 100, in parseArguments
self.dereplicate_operation(**vars(args))
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/controller.py", line 48, in dereplicate_operation
drep.d_workflows.dereplicate_wrapper(kwargs['work_directory'],**kwargs)
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/d_workflows.py", line 29, in dereplicate_wrapper
drep.d_filter.d_filter_wrapper(wd, genomes = genomes, Chdb = Chdb, **kwargs)
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/d_filter.py", line 73, in d_filter_wrapper
Gdb = calc_genome_info(bdb['location'].tolist())
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/d_filter.py", line 275, in calc_genome_info
table['length'].append(calc_fasta_length(loc))
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/drep/d_filter.py", line 655, in calc_fasta_length
for seq_record in SeqIO.parse(fasta_loc, "fasta"):
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/Bio/SeqIO/Interfaces.py", line 74, in __next__
return next(self.records)
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/Bio/SeqIO/FastaIO.py", line 206, in iterate
Seq(sequence), id=first_word, name=first_word, description=title,
File "/home/iozac/huangy1/miniconda2/envs/metagenomic/lib/python3.8/site-packages/Bio/Seq.py", line 1725, in __init__
self._data = bytes(data, encoding="ASCII")
UnicodeEncodeError: 'ascii' codec can't encode character '\u21b5' in position 4397: ordinal not in range(128)
Thanks!
It seems like your input genomes may be compressed. The only compression supported by dRep is gzip, and if you use it, the genome filename must end in .gz
My input genomes are not compressed, and the genome filenames end in .fa
This error is occurring because at least one of the files you are passing to dRep contains characters outside plain ASCII (the traceback points to '\u21b5' at position 4397). Maybe you're accidentally including a non-genome file in your input command?
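To find which file is responsible, something like the following sketch could be run over the same files given to dRep. It is not part of dRep; the function name `find_non_ascii` and the `max_hits` parameter are made up for illustration, and it assumes the files are text that can be read as UTF-8 (undecodable bytes are replaced and will also be reported).

```python
import sys


def find_non_ascii(path, max_hits=5):
    """Return up to max_hits (line_number, column, character) tuples
    for characters in the file that fall outside the ASCII range."""
    hits = []
    # errors="replace" keeps the scan going even if some bytes are not valid UTF-8
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line_no, line in enumerate(handle, start=1):
            for col, ch in enumerate(line, start=1):
                if ord(ch) > 127:
                    hits.append((line_no, col, ch))
                    if len(hits) >= max_hits:
                        return hits
    return hits


if __name__ == "__main__":
    # Pass the genome files as command-line arguments
    for fasta in sys.argv[1:]:
        for line_no, col, ch in find_non_ascii(fasta):
            print(f"{fasta}: line {line_no}, column {col}: {ch!r} (U+{ord(ch):04X})")
```

Saved under any name and run with the genome paths as arguments, it prints one line per offending character. A file that produces no output contains only ASCII, so the problem lies in one of the others.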