
ERROR in TE annotation stats

Open shiyi-pan opened this issue 4 months ago • 9 comments

Hi oushujun, thank you for developing this great tool for genome repeat annotation. I want to use EDTA to annotate my genome, but I ran into an error.

I installed EDTA v2.1.3 with mamba using the following command (I can't install the latest version because of the server configuration):
mamba env create -f EDTA.yml -p /gss1/home/ruanjian/EDTA

Here is the command I used to annotate my genome:
perl /gss1/home//c.annotation/a.TEs_annotation/EDTA/EDTA.pl --genome long.fa --species others --step all --overwrite 1 --threads 16 --sensitive 1 --anno 1 --evaluate 1

Here is the error I got:


GFF> line 7.
Use of uninitialized value $extra in substitution (s///) at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/gff2bed.pl line 101, <GFF> line 7.
Use of uninitialized value $extra in pattern match (m//) at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/gff2bed.pl line 102, <GFF> line 7.
Use of uninitialized value $element_end in concatenation (.) or string at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/gff2bed.pl line 110, <GFF> line 7.
Use of uninitialized value $TE_class in concatenation (.) or string at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/gff2bed.pl line 110, <GFF> line 7.
Use of uninitialized value $method in concatenation (.) or string at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/gff2bed.pl line 110, <GFF> line 7.
Use of uninitialized value $score in concatenation (.) or string at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/gff2bed.pl line 110, <GFF> line 7.
Use of uninitialized value $strand in concatenation (.) or string at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/gff2bed.pl line 110, <GFF> line 7.
Use of uninitialized value $phase in concatenation (.) or string at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/gff2bed.pl line 110, <GFF> line 7.
Use of uninitialized value $type in concatenation (.) or string at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/gff2bed.pl line 110, <GFF> line 7.
Argument "Binary:matches.." isn't numeric in numeric gt (>) at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/split_overlap.pl line 26, <IN> line 1.
Argument "matches" isn't numeric in numeric gt (>) at /gss1/home//c.annotation/a.TEs_annotation/EDTA/util/split_overlap.pl line 26, <IN> line 1.
Warning: LOC list - is empty.

Count all-versus-all misclassifications using the cleanup_nested.pl .stat file
perl count_nested.pl -in sequence.fa.stat -cat [redun|nested|all] > sequence.fa.stat.sum

Count all-versus-all misclassifications using the cleanup_nested.pl .stat file
perl count_nested.pl -in sequence.fa.stat -cat [redun|nested|all] > sequence.fa.stat.sum

Count all-versus-all misclassifications using the cleanup_nested.pl .stat file
perl count_nested.pl -in sequence.fa.stat -cat [redun|nested|all] > sequence.fa.stat.sum

ERROR: TE annotation stats results not found in long.fa.mod.EDTA.TE.fa.stat!
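The gff2bed.pl warnings all point at `<GFF> line 7`, which would be consistent with an input line that lacks the nine tab-separated fields of a GFF3 record, although that is only a guess from the log. As a possible first check, here is a minimal sketch (not part of EDTA) that lists such lines. The function name `find_malformed_gff_lines` and the sample records are hypothetical and used only for illustration.

```python
from typing import Iterable, List, Tuple


def find_malformed_gff_lines(
    lines: Iterable[str], expected_fields: int = 9
) -> List[Tuple[int, int, str]]:
    """Return (line_number, field_count, text) for every record that does
    not have exactly `expected_fields` tab-separated fields."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        # Blank lines and '#' comment/directive lines are not feature records.
        if not text.strip() or text.startswith("#"):
            continue
        count = len(text.split("\t"))
        if count != expected_fields:
            problems.append((number, count, text))
    return problems


# Hypothetical sample: one well-formed record and one truncated record.
sample = [
    "##gff-version 3\n",
    "chr1\tEDTA\tLTR_retrotransposon\t100\t900\t.\t+\t.\tID=TE_1\n",
    "chr1\tEDTA\t100\n",
]

for number, count, text in find_malformed_gff_lines(sample):
    print(f"line {number}: {count} field(s): {text}")
```

To run it on a real file, pass an open file handle instead of `sample`, for example `find_malformed_gff_lines(open(path))`, where `path` is whichever GFF file gff2bed.pl was reading at the time.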

Could you help me fix this problem? Thank you very much.

shiyi-pan, Feb 29 '24 13:02