
PanEDTA LINE Detection

Open sjteresi opened this issue 5 months ago • 7 comments

Hello Shujun,

Hope you are doing well. I am writing to share that I had issues with LINE detection in PanEDTA. I am hoping that this will help anyone else who encounters this issue. It is not a bug, just something that I think folks could easily overlook. When I ran PanEDTA (v2.1.0) on its own without pre-calculating results with regular EDTA, it was not finding any LINE elements in my genomes.

After doing some testing, I think it is because the panEDTA script by default calls EDTA.pl without the --sensitive 1 option. The sensitive option calls RepeatModeler. I also observed that when I ran regular EDTA.pl on a genome without the sensitive option, it did not recover any LINEs. So to summarize, it seems that RepeatModeler was doing the heavy lifting for LINE detection in my strawberry genomes, and without it, I wasn't detecting any LINEs. Jordan B, a post-doc in Pat's lab, also had this same LINE issue with some Camelina genomes.
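For anyone who wants to check whether their own run is affected, here is a minimal Python sketch that counts LINE features in a GFF3 annotation. It rests on assumptions that are not confirmed anywhere in this issue: that LINEs show up either with "LINE" in the feature type column or with a classification attribute whose value starts with "LINE". The function name is hypothetical, and you should adjust the matching to whatever your EDTA version actually writes.

```python
def count_line_features(gff3_lines):
    """Count GFF3 records that look like LINE elements.

    Assumption: a LINE record has "LINE" in column 3 (feature type) or a
    classification attribute (any letter case) starting with "LINE".
    """
    count = 0
    for line in gff3_lines:
        # Skip blank lines and GFF3 comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete 9-column GFF3 record
        feature_type, attributes = fields[2], fields[8]
        # Parse "key=value;key=value" attributes with lowercased keys.
        attrs = {}
        for pair in attributes.split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attrs[key.strip().lower()] = value.strip()
        if "LINE" in feature_type or attrs.get("classification", "").startswith("LINE"):
            count += 1
    return count
```

The function accepts any iterable of lines, so an open file handle for your TE annotation GFF3 can be passed in directly. A count of zero across a whole genome would be consistent with the behavior described above.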

In my case, I fixed the issue by running EDTA individually on each genome with the --sensitive 1 option, and then completing the pangenome annotation with panEDTA. That approach worked fine, and LINEs were indeed included in my final annotation.
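The per-genome step of that workaround can be sketched as a small Python helper that assembles one EDTA.pl command per genome with --sensitive 1. This is only an illustration: the helper name and genome file names are hypothetical, the commands are printed rather than run, and the other options you would normally pass (and the later panEDTA step) are left out on purpose. Please check the flags against the help output of your installed EDTA version.

```python
import shlex


def build_edta_command(genome_fasta, threads=4, edta_script="EDTA.pl"):
    """Return the argument list for one per-genome EDTA run.

    Only --sensitive 1 comes from this issue; the remaining arguments are
    assumed to match your usual EDTA invocation.
    """
    return [
        edta_script,
        "--genome", genome_fasta,
        "--sensitive", "1",  # per this issue, this turns on the RepeatModeler step
        "--threads", str(threads),
    ]


# Hypothetical genome files; replace with the genomes in your pangenome.
genomes = ["genomeA.fa", "genomeB.fa"]
for genome in genomes:
    # Print a shell-safe command line instead of executing it.
    print(shlex.join(build_edta_command(genome)))
```

Once each genome has its own annotation, panEDTA can be run as usual to finish the pangenome annotation.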

This problem only arises if users rely on panEDTA to perform all steps of their pangenome annotation. It can easily be sidestepped if users create the individual annotations with the --sensitive 1 option first.

Sincerely, Scott Teresi

sjteresi, Jan 30 '24 17:01