ANNOSINE implementation
Hello EDTA team,
I have a highly heterozygous diploid plant with two haplotype assemblies, which I annotated with EDTA independently of one another (--anno 1 --cds genes.cds --exclude genes.bed). One haplotype has SINEs (count=34, 7000bp total), and the other has none. Both the low count on one haplotype and the complete absence on the other seem odd.
I recall some issues with ANNOSINE in the past, so I ran ANNOSINEv2 on its own and then used those files to rerun EDTA_raw.pl with the raw.fa files of the respective annotations. In this case, the EDTA rerun eliminated all SINEs and actually reduced my total TE counts, including eliminating some categories completely (rDNA 45S, pararetrovirus and Endogenous_retrovirus/Bel_Pao). This leaves me with broader questions about how SINEs are handled.
Since I do not want to report erroneous SINEs that potentially overlap with EDTA results, I am seeking clarity on how SINEs are dealt with. I've attached an image to show the difference: track 1 is the ANNOSINEv2.gff3 result, and the track below it is the initial EDTA output (not the rerun). It appears that LTRs take precedence and can overwrite these SINEs, or perhaps the SINEs are poorly annotated to begin with? Only in one case was the SINE retained (TE000003 in the middle of the plot) and reported as a SINE, with a homology value.
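For reference, here is a minimal sketch of how the overlap between the two tracks could be counted rather than judged by eye. It assumes both files are standard 9-column, tab-separated GFF3 with 1-based inclusive coordinates. The function names are hypothetical and are not part of EDTA or ANNOSINE.

```python
def parse_gff3(lines):
    """Yield (seqid, start, end, feature_type, attributes) per feature line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a well-formed GFF3 feature line
        yield cols[0], int(cols[3]), int(cols[4]), cols[2], cols[8]


def overlapping_features(sine_lines, edta_lines):
    """Return (sine, edta_feature, overlap_bp) for every overlapping pair.

    Hypothetical helper: a simple all-against-all comparison per sequence,
    which is fine for a few dozen SINE candidates.
    """
    edta_by_seq = {}
    for feat in parse_gff3(edta_lines):
        edta_by_seq.setdefault(feat[0], []).append(feat)

    hits = []
    for sine in parse_gff3(sine_lines):
        for feat in edta_by_seq.get(sine[0], []):
            # GFF3 intervals are 1-based and inclusive, hence the +1
            overlap = min(sine[2], feat[2]) - max(sine[1], feat[1]) + 1
            if overlap > 0:
                hits.append((sine, feat, overlap))
    return hits
```

Passing the open ANNOSINEv2.gff3 file as `sine_lines` and the EDTA annotation GFF3 as `edta_lines` would list, for each SINE candidate, which EDTA features cover it and by how many bp. That should show whether the missing SINEs all sit inside LTR annotations.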
Please let me know your thoughts, or if I am going down a serious rabbit hole that may not be worthwhile.
Thank you, Sam