EarlGrey
Filtering overlapping repeats for chimeras
Hi!
I was looking for a way to filter overlapping sequences out of my RepeatCraft output and tried your filteringOverlappingRepeats.R script.
However, it seems to have an issue with chimeric or nested repeats. In these cases either the overlap is not resolved, or the nested repeat ends up with a start coordinate that is greater than its end coordinate.
For example, an LTR nested in a TIR. The first two lines below are from the rmerge file; the last two are the same repeats after filtering:
contig_1000 RepeatMasker CLASSII/TIR 9374 9777 12.2 + . Tstart=48;Tend=405;ID=EDTA_TE_00001334_inc;shortTE=T
contig_1000 RepeatMasker CLASSI/LTR 9514 9612 25.2 + . Tstart=5136;Tend=5358;ID=RM2_rnd-5_family-4_unconfirmed;shortTE=T
contig_1000 RepeatMasker CLASSII/TIR 9444 9645 12.2 + NA Tstart=48;Tend=405;ID=EDTA_TE_00001334_inc;shortTE=T
contig_1000 RepeatMasker CLASSI/LTR 9646 9612 25.2 + NA Tstart=5136;Tend=5358;ID=RM2_rnd-5_family-4_unconfirmed;shortTE=T
I am not sure what the easiest way to solve this in the current code is, as you would need to update both repeats at the same time.
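In case it helps, here is a minimal sketch of one possible approach. It is written in Python rather than R and is not based on how filteringOverlappingRepeats.R currently works. It cuts each contig at every repeat boundary and assigns each resulting segment to the shortest repeat covering it, so a nested repeat interrupts its host instead of both being trimmed against each other. The `Repeat` type, the `resolve_overlaps` name and the "shortest repeat wins" rule are all my own assumptions; a real fix might prefer to decide by score or divergence.

```python
from typing import List, NamedTuple


class Repeat(NamedTuple):
    """Hypothetical minimal record: only the fields needed to resolve overlaps."""
    seqid: str
    family: str
    start: int  # 1-based, inclusive (as in GFF)
    end: int    # 1-based, inclusive (as in GFF)


def resolve_overlaps(repeats: List[Repeat]) -> List[Repeat]:
    """Return non-overlapping pieces; a nested repeat splits its host in two.

    Assumption: where repeats overlap, the shortest one keeps the bases.
    """
    resolved: List[Repeat] = []
    for seqid in sorted({r.seqid for r in repeats}):
        on_contig = [r for r in repeats if r.seqid == seqid]
        # Coverage can only change at a start or just after an end.
        cuts = sorted({r.start for r in on_contig} | {r.end + 1 for r in on_contig})
        last_owner = None
        for left, next_left in zip(cuts, cuts[1:]):
            right = next_left - 1
            covering = [r for r in on_contig if r.start <= left and right <= r.end]
            if not covering:
                last_owner = None  # gap between repeats
                continue
            # Shortest covering repeat wins; ties go to the leftmost start.
            owner = min(covering, key=lambda r: (r.end - r.start, r.start))
            if owner == last_owner:
                # Same repeat continues: extend the previous piece.
                resolved[-1] = resolved[-1]._replace(end=right)
            else:
                resolved.append(owner._replace(start=left, end=right))
            last_owner = owner
    return resolved


# The two input records from the rmerge excerpt above.
example = [
    Repeat("contig_1000", "CLASSII/TIR", 9374, 9777),
    Repeat("contig_1000", "CLASSI/LTR", 9514, 9612),
]
for piece in resolve_overlaps(example):
    print(piece.seqid, piece.family, piece.start, piece.end)
```

With these two records, the TIR is split into 9374-9513 and 9613-9777 and the LTR keeps 9514-9612, so no piece can end up with a start after its end. Because every segment is assigned in a single pass, both repeats are updated together.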
Cheers