
tuning ULTRA to find large unit repeats

Open: svedwards opened this issue 2 months ago • 1 comment

Hi again Daniel -

I am doing a little comparison among RepeatMasker, Satellite Repeat Finder (SRF) and ULTRA for annotating some interesting VNTRs that the first two tools missed but ULTRA detected. Conversely, SRF finds some satellites with very long repeat units, on the order of 18 kb, but ULTRA doesn't find them in my runs with default parameters (using the --tune option). I'm curious what the dynamic range of ULTRA might be in terms of repeat unit length, and how I can push it further to find satellites with larger repeat units. -- Scott

svedwards, Oct 25 '25 23:10

Hi Scott,

By default, ULTRA uses a maximum detectable repeat period of 100, although this can be changed using -p. Increasing the maximum period will increase your total runtime, but it will allow ULTRA to find the satellites it misses with default settings. I suggest trying these settings with a maximum period of 500, 1000, or 2000:

ultra -p <max period> -i 3 -d 3 -t <num threads>

-i 3 and -d 3 reduce the number of insertion and deletion states in ULTRA's model, respectively; in my testing these settings significantly improve speed and cost only a very small reduction in sensitivity. I suspect that the 18 kb period repeat that you have found is some sort of higher order repeat (HOR), that is, a hierarchical composition of smaller tandem repeats. In that case ULTRA should be able to fully (or nearly fully) annotate the repeat using a much smaller max repeat period.
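
For anyone wanting to script the sweep over the three suggested maximum periods, here is a minimal Python sketch that only builds the command lines, using the flags mentioned above (-p, -i, -d, -t). The helper name `ultra_commands`, the file name `genome.fa`, the thread count, and the assumption that the input FASTA is passed as a trailing positional argument are all illustrative and not taken from this thread; check `ultra --help` for the exact invocation before running anything.

```python
import shlex


def ultra_commands(fasta, max_periods=(500, 1000, 2000), threads=4):
    """Build one ULTRA command line per maximum repeat period.

    Hypothetical helper: it only assembles argument lists and does not
    run ULTRA. The positional FASTA argument is an assumption.
    """
    commands = []
    for period in max_periods:
        commands.append([
            "ultra",
            "-p", str(period),   # maximum detectable repeat period
            "-i", "3",           # fewer insertion states (faster)
            "-d", "3",           # fewer deletion states (faster)
            "-t", str(threads),  # number of threads
            fasta,               # assumed: input file given last
        ])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be inspected before being run.
    for cmd in ultra_commands("genome.fa"):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to try; each list could be handed to `subprocess.run` once the invocation has been confirmed, with runtime expected to grow as the maximum period increases.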

Thanks for reaching out - let me know if the higher period ULTRA parameters improve your annotation quality.

Daniel

DanielOlson, Oct 26 '25 15:10