Structural TEs appear not to be renamed by panEDTA

Open joannarifkin opened this issue 8 months ago • 6 comments

Hi Shujun,

Another question. I'm running panEDTA on a bunch of species. It says it's successfully reannotating the structurally annotated TEs, e.g.:

Sat Oct 14 15:10:05 EDT 2023 EDTA final stage finished! You may check out:
    The final EDTA TE library: Cviolacea_585_v2.0.fa.mod.EDTA.TElib.fa
    Family names of intact TEs have been updated by genome_list_local.txt.panEDTA.TElib.fa: Cviolacea_585_v2.0.fa.mod.EDTA.intact.gff3
    Comparing to the provided library, EDTA found these novel TEs: Cviolacea_585_v2.0.fa.mod.EDTA.TElib.novel.fa
    The provided library has been incorporated into the final library: Cviolacea_585_v2.0.fa.mod.EDTA.TElib.fa

In the output, both Cviolacea_585_v2.0.fa.mod.EDTA.TElib.fa and genome_list_local.txt.panEDTA.TElib.fa include numerous sequences headed "panTE," but in Cviolacea_585_v2.0.fa.mod.EDTA.intact.gff3 no TEs are annotated with the heading "panTE." Similarly, if I filter Cviolacea_585_v2.0.fa.mod.EDTA.TEanno.gff3 for method=structural, no TEs are annotated as "panTE."
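For anyone who wants to reproduce this check, here is a minimal Python sketch of the comparison described above. It is not part of EDTA or panEDTA: the function names are hypothetical, the toy records are made up for illustration, and it assumes that "panTE" appears as a plain substring in FASTA headers and in the GFF3 attribute column, and that structural annotations carry `method=structural` there.

```python
import io


def count_fasta_headers(handle, tag="panTE"):
    """Count FASTA header lines (starting with '>') that contain `tag`."""
    return sum(1 for line in handle if line.startswith(">") and tag in line)


def count_gff3_hits(handle, tag="panTE", method=None):
    """Count GFF3 feature lines whose attribute column (column 9) contains `tag`.

    If `method` is given, only lines with a `method=<method>` attribute count.
    """
    hits = 0
    for line in handle:
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed feature line
        attrs = cols[8]
        if method is not None:
            fields = [field.strip() for field in attrs.split(";")]
            if f"method={method}" not in fields:
                continue
        if tag in attrs:
            hits += 1
    return hits


# Toy, made-up records (not real EDTA output) to show the calls.
toy_fasta = ">panTE_1#LTR/Gypsy\nACGT\n>TE_00000001#DNA/Helitron\nACGT\n"
toy_gff3 = (
    "##gff-version 3\n"
    "Chr1\tEDTA\tGypsy_LTR_retrotransposon\t100\t900\t.\t+\t.\t"
    "ID=te1;Name=TE_00000001;method=structural\n"
    "Chr1\tEDTA\thelitron\t1000\t1500\t.\t-\t.\t"
    "ID=te2;Name=panTE_1;method=homology\n"
)

print("library headers with panTE:", count_fasta_headers(io.StringIO(toy_fasta)))
print("GFF3 lines with panTE:", count_gff3_hits(io.StringIO(toy_gff3)))
print(
    "structural GFF3 lines with panTE:",
    count_gff3_hits(io.StringIO(toy_gff3), method="structural"),
)
```

To run it on real output, pass an open file handle instead of the toy strings, for example one opened on Cviolacea_585_v2.0.fa.mod.EDTA.TElib.fa for the first function and on Cviolacea_585_v2.0.fa.mod.EDTA.TEanno.gff3 for the second. A nonzero count for the library alongside a zero count for the structural GFF3 lines would match what I am seeing.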

The error log repeats this message many times for each genome:

Unspecified/NA not found in the TE_SO database, it will not be used to rename sequences in the final annotation.

I assume this is where the problem is coming from?

This seems to have happened to all the genomes I included, and appears to be just a problem with updating the names. What information would help you solve this?

Thanks!

Joanna

joannarifkin, Oct 23 '23 16:10