How can I determine the optimal number of substitution rate categories to use?

Open liamxg opened this issue 1 year ago • 3 comments

Dear @stephaneguindon,

How can I determine the optimal number of substitution rate categories to use? Thanks.

liamxg avatar Aug 19 '24 23:08 liamxg

The best approach here is to use the FreeRate model (the --freerates option on the command line) combined with the -c X option, where X is the number of rate classes. You can then compare a model with X classes to one with X+1 classes using a likelihood ratio test. The likelihood ratio statistic is approximately distributed as a chi-square with 2 degrees of freedom, since each extra class adds one rate and one class frequency. Note, however, that a model with X classes can be derived from one with X+1 classes by setting two rates to be equal and/or one class frequency to zero, which makes the chi-square approximation perhaps too conservative.
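A minimal sketch of the test described above, assuming you have already read the log-likelihoods of the two fits from the PhyML output. The function name and the example values are hypothetical. For 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def lrt_pvalue(lnl_x: float, lnl_x_plus_1: float) -> float:
    """P-value of the LRT comparing X classes against X+1 classes.

    Assumes a chi-square reference distribution with 2 degrees of freedom,
    whose survival function is exp(-x / 2).
    """
    stat = 2.0 * (lnl_x_plus_1 - lnl_x)
    # The richer model should never fit worse; clamp small negative values
    # that can arise from incomplete optimisation.
    stat = max(stat, 0.0)
    return math.exp(-stat / 2.0)


# Hypothetical log-likelihoods for fits with 3 and 4 FreeRate classes.
lnl_3 = -12345.67
lnl_4 = -12340.12
print(f"LRT statistic = {2.0 * (lnl_4 - lnl_3):.2f}, p = {lrt_pvalue(lnl_3, lnl_4):.4f}")
```

A p-value below the chosen threshold (0.05, say) favours the model with X+1 classes. Given the caveat about conservativeness, a borderline result is worth treating with some caution.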

stephaneguindon avatar Aug 20 '24 08:08 stephaneguindon

Thanks, but could you please tell me how many times I should run the test to find the best X? @stephaneguindon

liamxg avatar Aug 20 '24 14:08 liamxg

Dear @stephaneguindon,

How can I change the "Compute approximate likelihood ratio test" option on the command line?

Best, Liam

liamxg avatar Sep 26 '24 02:09 liamxg

I think normally you can simply go with 4 categories for the discrete gamma distribution (Yang, 1994), but do use -a e so that alpha is estimated by maximum likelihood. So if you are using gamma (+G in IQ-TREE), comparing X and X+1 doesn't seem to make much sense to me, as both are approximations of the continuous gamma. But as Stephane suggested, if you use --free_rates (+R in IQ-TREE), then you may want to use an LRT to determine the best number of FreeRate classes. As to your last question, you can do a loop: start from X=1, compare X with X+1, and stop when the LRT no longer shows a significant improvement; that X is what you need. By the way, in IQ-TREE I think the default number of FreeRate classes is 8?
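The loop could be sketched roughly as follows. This is only an illustration: the helper names, the alignment file name and the log-likelihood values are made up, the executable is assumed to be called phyml with -i as the input option, and reading the log-likelihood out of the PhyML output is left out.

```python
import math
from typing import Dict, List


def phyml_command(alignment: str, n_classes: int) -> List[str]:
    """Build the command line for one FreeRate fit (hypothetical helper)."""
    return ["phyml", "-i", alignment, "--freerates", "-c", str(n_classes)]


def choose_num_classes(lnl_by_classes: Dict[int, float], alpha: float = 0.05) -> int:
    """Return the first X for which X+1 classes is not significantly better.

    Uses the chi-square tail with 2 degrees of freedom, exp(-stat / 2).
    If every comparison is significant, the largest X supplied is returned.
    """
    x = min(lnl_by_classes)
    while x + 1 in lnl_by_classes:
        stat = max(2.0 * (lnl_by_classes[x + 1] - lnl_by_classes[x]), 0.0)
        p_value = math.exp(-stat / 2.0)
        if p_value >= alpha:
            break  # adding a class no longer helps significantly
        x += 1
    return x


# Made-up log-likelihoods, one per number of classes.
lnl = {1: -5200.0, 2: -5150.0, 3: -5141.0, 4: -5140.2, 5: -5140.1}
print(" ".join(phyml_command("alignment.phy", 2)))
print("chosen number of classes:", choose_num_classes(lnl))
```

In practice you would run one fit at a time and stop at the first non-significant step rather than fitting all values of X up front. So the number of tests is not fixed in advance; it depends on the data.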

evolbeginner avatar Aug 13 '25 03:08 evolbeginner