EDTA fails at TIR step
Hi Shujun, I am having an issue that may be related to two other TIR-related issues from this week (https://github.com/oushujun/EDTA/issues/572, https://github.com/oushujun/EDTA/issues/573). I ran EDTA and it failed at the TIR step.
From the error message I am not sure whether this is a software issue (the TypeError about a Seq object) or whether no TIR sequences were identified: TypeError: The sequence data given to a Seq object should be a string (not another Seq object etc). From the help dialog shown when running EDTA.pl with no arguments, I understood that --force 1 is the default, so it should have continued, right?
I installed EDTA earlier this week with mamba install -c conda-forge -c bioconda edta. The environment uses Python 3.11.11 and Extensive de-novo TE Annotator (EDTA) v2.2.2, build hdfd78af_1, from the bioconda channel.
The command I ran was:
EDTA.pl --genome sample_input.fasta --species others --cds longest_orfs.fasta --step all --overwrite 1 --anno 1 --threads 16
Mon Jun 9 11:11:35 PM CEST 2025 Identify LINE retrotransposon candidates from scratch.
Use of uninitialized value in string ne at /path/to/envs/EDTA/share/EDTA/bin/cleanup_misclas.pl line 61, <Clas> line 9.
Use of uninitialized value within %lib in string ne at /path/to/envs/EDTA/share/EDTA/bin/cleanup_misclas.pl line 61, <Clas> line 9.
Tue Jun 10 04:09:44 AM CEST 2025 Finish finding LINE candidates.
Tue Jun 10 04:09:44 AM CEST 2025 Start to find TIR candidates.
Tue Jun 10 04:09:44 AM CEST 2025 Identify TIR candidates from scratch.
Species: others
Traceback (most recent call last):
File "/path/to/envs/EDTA/share/TIR-Learner3/TIR-Learner.py", line 119, in <module>
main()
File "/path/to/envs/EDTA/share/TIR-Learner3/TIR-Learner.py", line 116, in main
TIRLearner_instance.execute()
File "/path/to/envs/EDTA/share/TIR-Learner3/app/main.py", line 319, in execute
self.execute_m4()
File "/path/to/envs/EDTA/share/TIR-Learner3/app/main.py", line 803, in execute_m4
self["TIRvish"] = run_TIRvish.execute(self)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/envs/EDTA/share/TIR-Learner3/app/run_TIRvish.py", line 264, in execute
df = _run_TIRvish_py_para(genome_file, genome_name, TIR_length, processors, flag_debug, gt_path,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/envs/EDTA/share/TIR-Learner3/app/run_TIRvish.py", line 229, in _run_TIRvish_py_para
fasta_files_path_list.extend(process_fasta(genome_file, MP_SPLIT_SEQ_LEN, MP_OVERLAP_SEQ_LEN))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/envs/EDTA/share/TIR-Learner3/app/run_TIRvish.py", line 118, in process_fasta
segments = split_sequence_evenly(seq_record, split_seq_len, overlap_seq_len)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/envs/EDTA/share/TIR-Learner3/app/run_TIRvish.py", line 61, in split_sequence_evenly
records.append(create_sequence_record(segment_seq, part_id))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/envs/EDTA/share/TIR-Learner3/app/run_TIRvish.py", line 32, in create_sequence_record
return SeqRecord(Seq(seq), id=id, description="")
^^^^^^^^
File "/path/to/envs/EDTA/lib/python3.11/site-packages/Bio/Seq.py", line 95, in __init__
raise TypeError(
TypeError: The sequence data given to a Seq object should be a string (not another Seq object etc)
Can't open ./TIR-Learner-Result/TIR-Learner_FinalAnn.fa: No such file or directory at /path/to/envs/EDTA/share/EDTA/bin/rename_tirlearner.pl line 19.
Warning: LOC list sample_input.fasta.mod.TIR.ext30.list is empty.
Error: Error while loading sequence
Filter sequence based on TEsorter classifications. Unclassified sequences will also be output to the clean file.
Usage: perl cleanup_misclas.pl sequence.fa.rexdb.cls.tsv
Author: Shujun Ou ([email protected]) 10/11/2019
mv: cannot stat 'sample_input.fasta.mod.TIR.ext30.fa.pass.fa.dusted.cln.cln': No such file or directory
cp: cannot stat 'sample_input.fasta.mod.TIR.ext30.fa.pass.fa.dusted.cln.cln.list': No such file or directory
cp: cannot stat 'sample_input.fasta.mod.TIR.intact.raw.fa.anno.list': No such file or directory
Can't open ./TIR-Learner-Result/TIR-Learner_FinalAnn.gff3: No such file or directory.
ERROR: No such file or directory at /path/to/envs/EDTA/share/EDTA/bin/output_by_list.pl line 39.
Error: TIR results not found!
ERROR: Raw TIR results not found in sample_input.fasta.mod.EDTA.raw/sample_input.fasta.mod.TIR.intact.raw.fa
If you believe the program is working properly, this may be caused by the lack of intact TIRs in your genome. Consider to use the --force 1 parameter to overwrite this check
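In a log like the one above, the later "No such file or directory" messages may simply follow from the earlier Python traceback, since TIR-Learner exited before writing its result files. As a rough sketch for pulling out that first exception, the helper below scans log lines for the first traceback and returns its final exception line. The name `first_python_exception` is made up for illustration and is not part of EDTA or TIR-Learner.

```python
import re

# Matches the closing line of a Python traceback, e.g. "TypeError: message".
EXC_LINE = re.compile(r"^\s*([A-Za-z_][\w.]*(?:Error|Exception)):\s*(.*)$")


def first_python_exception(log_lines):
    """Return (exception_name, message) for the first traceback, or None.

    Hypothetical helper: it only looks for the standard
    "Traceback (most recent call last):" marker and the first
    "<Name>Error: ..." or "<Name>Exception: ..." line after it.
    """
    in_traceback = False
    for line in log_lines:
        if line.lstrip().startswith("Traceback (most recent call last):"):
            in_traceback = True
            continue
        if in_traceback:
            match = EXC_LINE.match(line)
            if match:
                return match.group(1), match.group(2)
    return None
```

To use it on a saved log (the file name is an assumption), something like `first_python_exception(open("edta.log"))` would return the exception name and message, which is usually the line worth searching for in existing issues.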
Hello,
Please double check your input genome file. Please also test your installation with the test data.
Shujun
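For anyone who wants a quick way to double check the input genome file, here is a minimal sketch of a FASTA sanity check using only the Python standard library. It is not part of EDTA and does not reproduce EDTA's own input checks. The function name `check_fasta` and the choice of checks (duplicate IDs, empty records, characters outside the IUPAC nucleotide codes) are assumptions about what is worth looking at.

```python
from collections import Counter

# Assumption: a nucleotide FASTA; anything outside the IUPAC DNA codes is flagged.
IUPAC_DNA = set("ACGTNRYSWKMBDHV")


def check_fasta(lines):
    """Summarise a FASTA given as an iterable of lines and list likely problems."""
    ids, problems = [], []
    lengths, bad_chars = Counter(), Counter()
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token of the header.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            ids.append(current)
        elif current is None:
            if "sequence data before first header" not in problems:
                problems.append("sequence data before first header")
        else:
            lengths[current] += len(line)
            bad_chars.update(c for c in line.upper() if c not in IUPAC_DNA)
    if not ids:
        problems.append("no FASTA records found")
    duplicates = sorted(i for i, n in Counter(ids).items() if n > 1)
    if duplicates:
        problems.append("duplicate IDs: " + ", ".join(duplicates))
    empty = sorted(i for i in set(ids) if lengths[i] == 0)
    if empty:
        problems.append("empty sequences: " + ", ".join(empty))
    if bad_chars:
        problems.append("non-IUPAC characters: " + "".join(sorted(bad_chars)))
    return {"records": len(ids), "total_bp": sum(lengths.values()), "problems": problems}
```

Passing an open file handle works because the function only iterates over lines, for example `check_fasta(open("sample_input.fasta"))`. An empty `problems` list only means these particular checks passed, not that the file is valid for every tool in the pipeline.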
It was my mistake: I misread the --help dialog! The pipeline worked with --force 1. I made a suggestion via a PR. Thank you!
Thanks for the suggestion! Multicellular organisms usually have TIR elements. If your genome is from a plant or animal, you should expect TIRs.
Thanks, @oushujun - that is something I will have to follow up on with this genome!! I appreciate you folding in the PR.
Hi @conchoecia, I have encountered the same problem as you. Should I add --force 1 to my command?