Use of uninitialized value $lLTR_start ($rLTR_end, $chr)

Open kingforest93 opened this issue 2 years ago • 4 comments

Hi shujun,

Thanks for your excellent work in developing EDTA (Extensive de-novo TE Annotator) for the research community. I am using EDTA to annotate interspersed repeats in a de-novo assembled plant genome and, in particular, to build a non-redundant TE library for downstream analysis.

I used conda to install the dependencies of EDTA and then cloned the source code of EDTA v1.9.9. The test run completed OK and the results were exactly as expected. Next, I used subsets of the assembled contigs (total length ranging from 1 Mb to 500 Mb) to estimate the run time of EDTA. The command and options for running EDTA were as follows: /usr/bin/time -v EDTA.pl --genome JJ_100Mb.fa --sensitive 1 --overwrite 1 --force 1 --threads 20
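For readers who want to build partial inputs of increasing size like those described above, here is a minimal sketch. It is not how the original subsets were produced (the post does not say); it assumes whole contigs are taken in file order until a target total length is reached. The function names `read_fasta`, `take_contigs`, and `write_fasta` are hypothetical helpers written for this example.

```python
import io
from typing import Iterable, Iterator, List, TextIO, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(handle: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def take_contigs(records: Iterable[Record], target_bp: int) -> Tuple[List[Record], int]:
    """Assumption: keep whole contigs, in file order, until target_bp is reached."""
    picked: List[Record] = []
    total = 0
    for name, seq in records:
        if total >= target_bp:
            break
        picked.append((name, seq))
        total += len(seq)
    return picked, total


def write_fasta(records: Iterable[Record], handle: TextIO, width: int = 60) -> None:
    """Write records as FASTA with fixed-width sequence lines."""
    for name, seq in records:
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


if __name__ == "__main__":
    # Tiny in-memory demo; a real run would open the assembly file instead.
    demo = io.StringIO(">ctg1\nACGTAC\n>ctg2\nGGGG\n>ctg3\nTT\n")
    subset, total = take_contigs(read_fasta(demo), target_bp=8)
    print([name for name, _ in subset], total)
```

Because contigs are never cut, the subset can overshoot the target by up to one contig length. For a real assembly, `target_bp` would be something like `100_000_000` and the output would be written to a file such as JJ_100Mb.fa.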

The EDTA pipeline ran well with input contigs from 1 Mb to 100 Mb and finished without any ERROR. There were two WARNING messages, shown below, that should not affect the results.

-WARNING- Grid computing is not available because DRMAA not configured properly: Could not find drmaa library. Please specify its full path using the environment variable DRMAA_LIBRARY_PATH
BiopythonWarning: Partial codon, len(sequence) not a multiple of three. Explicitly trim the sequence or add trailing N before translation. This may become an error in future.

When the 500-Mb contigs were used as input for EDTA, the following WARNING and ERROR messages appeared. Despite this, the pipeline continued and finished (except for the remaining TE discovery by RepeatModeler, which did not finish), but I am not sure whether these errors affect the final results.

Use of uninitialized value $lLTR_start in numeric lt (<) at path_to_anaconda3/envs/EDTA/share/LTR_retriever/bin/make_gff3.pl line 55, <List> line 2.
Use of uninitialized value $rLTR_end in numeric lt (<) at path_to_anaconda3/envs/EDTA/share/LTR_retriever/bin/make_gff3.pl line 55, <List> line 2.
Use of uninitialized value $chr in concatenation (.) or string at path_to_anaconda3/envs/EDTA/share/LTR_retriever/bin/make_gff3.pl line 63, <List> line 2.
Use of uninitialized value $chr_ori in exists at path_to_anaconda3/envs/EDTA/share/LTR_retriever/bin/make_gff3.pl line 72, <List> line 2.
...
Use of uninitialized value $chr_pre in hash element at path_to_anaconda3/envs/EDTA/share/EDTA/util/call_seq_by_list.pl line 90.
ERROR: Can not recognize this MSU position in the list!
Use of uninitialized value $lLTR_start in numeric lt (<) at path_to_anaconda3/envs/EDTA/share/EDTA/util/rename_LTR_skim.pl line 49, <Seq> chunk 1.
Use of uninitialized value $rLTR_end in numeric lt (<) at path_to_anaconda3/envs/EDTA/share/EDTA/util/rename_LTR_skim.pl line 49, <Seq> chunk 1.
Illegal division by zero at path_to_anaconda3/envs/EDTA/share/EDTA/util/cleanup_tandem.pl line 100, <File> chunk 1.
...
Use of uninitialized value $info in pattern match (m//) at path_to_anaconda3/envs/EDTA/share/EDTA/util/filter_gff3.pl line 38, <GFF> line 6.
...
Use of uninitialized value $seq_new in substr at path_to_anaconda3/envs/EDTA/share/EDTA/util/cleanup_nested.pl line 190.
Thread 30 terminated abnormally: substr outside of string at path_to_anaconda3/envs/EDTA/share/EDTA/util/cleanup_nested.pl line 190.
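A log like this can be long and repetitive, so it may help to tally which script, line, and variable each "Use of uninitialized value" message points at before reporting it. The sketch below is one possible way to do that. It is not part of EDTA or LTR_retriever; the function name `summarize_uninitialized` is hypothetical, and the regular expression assumes the standard Perl warning wording seen above and script paths without spaces.

```python
import os
import re
from collections import Counter
from typing import Tuple

# Matches Perl's "Use of uninitialized value [$var] in <op> at <file> line <n>".
# Assumption: the script path contains no whitespace.
WARNING_RE = re.compile(
    r"Use of uninitialized value(?: (\$\w+))? in (.+?) at (\S+) line (\d+)"
)


def summarize_uninitialized(log_text: str) -> Counter:
    """Count warnings per (script name, line number, variable name)."""
    counts: Counter = Counter()
    for match in WARNING_RE.finditer(log_text):
        variable, _operation, path, line_no = match.groups()
        key: Tuple[str, int, str] = (
            os.path.basename(path),
            int(line_no),
            variable or "(unnamed)",
        )
        counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "Use of uninitialized value $lLTR_start in numeric lt (<) at "
        "path_to_anaconda3/envs/EDTA/share/LTR_retriever/bin/make_gff3.pl "
        "line 55, <List> line 2.\n"
        "Use of uninitialized value $chr in concatenation (.) or string at "
        "path_to_anaconda3/envs/EDTA/share/LTR_retriever/bin/make_gff3.pl "
        "line 63, <List> line 2.\n"
    )
    for (script, line_no, variable), n in summarize_uninitialized(sample).most_common():
        print(f"{n}\t{script}:{line_no}\t{variable}")
```

The summary only groups messages; it does not say whether the final annotation is affected. Other message types in the log, such as "Illegal division by zero", are deliberately not matched and still need to be read directly.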

I also tried EDTA v1.9.6 (installed by my colleague, and it worked well for several genomes of about 1 Gb), but the above WARNING and ERROR messages still occurred. Do these affect the final results?

wang sen

The EDTA run commands and log are attached: edta.500Mb.zip

kingforest93 avatar Oct 08 '21 07:10 kingforest93

Dear Sen,

Sorry for the delayed reply. You should not need the --force 1 parameter unless the first step (EDTA_raw) fails. Some fungal and animal genomes may need this parameter, but plant genomes rarely do. Please try the 500-Mb sample without it.

You can use the latest version by cloning the GitHub repository to your work directory. The clone can be used within your EDTA conda environment.

Best, Shujun

oushujun avatar Oct 22 '21 05:10 oushujun

Dear Shujun,

Thanks for your reminder and suggestion. I'll try the latest version without the option --force 1.

Best regards,

Sen

kingforest93 avatar Oct 22 '21 06:10 kingforest93

@kingforest93 How did it go?

oushujun avatar Nov 10 '21 00:11 oushujun

Dear Shujun,

I have tested EDTA on small genomes (less than 1 Gb), and the outputs are OK except for the DRMAA "WARNING". But I encountered the same kind of "ERROR" when running EDTA on a 4.8-Gb high-quality (T2T) genome:

Use of uninitialized value $ac in substitution (s///) at /data1/home/beirui/anaconda3/envs/EDTA/share/LTR_retriever/bin/LTR.identifier.pl line 286.
Use of uninitialized value $ac in substitution (s///) at /data1/home/beirui/anaconda3/envs/EDTA/share/LTR_retriever/bin/LTR.identifier.pl line 286.
Use of uninitialized value $bd in substitution (s///) at /data1/home/beirui/anaconda3/envs/EDTA/share/LTR_retriever/bin/LTR.identifier.pl line 287.
Use of uninitialized value $bd in substitution (s///) at /data1/home/beirui/anaconda3/envs/EDTA/share/LTR_retriever/bin/LTR.identifier.pl line 287.
Use of uninitialized value $ac in pattern match (m//) at /data1/home/beirui/anaconda3/envs/EDTA/share/LTR_retriever/bin/LTR.identifier.pl line 289.
Use of uninitialized value $bd in pattern match (m//) at /data1/home/beirui/anaconda3/envs/EDTA/share/LTR_retriever/bin/LTR.identifier.pl line 289.
Use of uninitialized value $ac in pattern match (m//) at /data1/home/beirui/anaconda3/envs/EDTA/share/LTR_retriever/bin/LTR.identifier.pl line 289.

The command is: EDTA.pl --genome jm_A.fa --species others --sensitive 1 --anno 1 --threads 20

Should I split the genome manually to keep the input size below 1 Gb? Thank you very much for your consideration.

Beirui

BRNiu avatar Sep 20 '22 12:09 BRNiu

@BRNiu It has been a while... I hope this issue is solved. Please reopen the issue if not. Thank you! - Shujun

oushujun avatar Jan 08 '24 06:01 oushujun