pyTsetlinMachine

how to do hyper-parameter search

Open ruilinchen opened this issue 3 years ago • 1 comments

Thank you for making all the code available! I'm testing the RegressionTsetlinMachine module with my own data, and I am just wondering if there is any guideline concerning hyper-parameter search.

My features can be converted into series of 20,000 bits, and my output is continuous with about 2,000 observations. It took me about 3 minutes to run one experiment on my local machine with (number_of_clauses=1000, T=5000, s=2.75), so I cannot afford to run many experiments to optimize these values, and there seem to be so many possible combinations that I feel a little overwhelmed. I would therefore really appreciate it if the team could provide some insight into how I might narrow my hyper-parameter search space for (number_of_clauses, T, s) given the data that I have.

ruilinchen, Oct 19 '20 01:10

Hi @ruilinchen! Thanks for your comment. One trick that may help: once you find a ratio between number_of_clauses and T that works well, you can keep that ratio fixed and adjust both parameters together, e.g., by doubling or halving both. For the s-parameter, values in the range 2.0 to 20.0 often work well. To avoid tuning s, you can try multi-granular clauses, e.g., by setting s=2.0 and s_range=20.0. The clauses will then have different s-values, from 2.0 to 20.0, which brings performance close to that of the best single s-value. To maximize performance, however, you still need to search for the best s-value. Hope these guidelines help you find good hyper-parameters!

olegranmo, Feb 08 '21 17:02
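
For readers who want to try the advice in the reply above, here is a minimal sketch of a reduced search, not an official recipe from the library authors. It scales number_of_clauses and T together so their ratio stays fixed, and tries a handful of s-values in the 2.0 to 20.0 range. The helper names (`candidate_grid`, `search`, `make_tm_evaluator`), the scale factors, the chosen s-values, the epoch count, and the use of validation RMSE as the score are all assumptions made for illustration; only `RegressionTsetlinMachine(number_of_clauses, T, s)` with `fit` and `predict` comes from pyTsetlinMachine itself.

```python
from itertools import product


def candidate_grid(base_clauses, base_T,
                   scales=(0.5, 1.0, 2.0),
                   s_values=(2.0, 5.0, 10.0, 20.0)):
    """Yield (number_of_clauses, T, s) with the clauses/T ratio held fixed.

    Each scale factor multiplies both base values, so doubling or halving
    one parameter always doubles or halves the other.
    """
    for scale, s in product(scales, s_values):
        yield int(base_clauses * scale), int(base_T * scale), s


def search(evaluate, base_clauses=1000, base_T=5000, **grid_kwargs):
    """Return (best_params, best_score); a lower score is treated as better."""
    best = None
    for params in candidate_grid(base_clauses, base_T, **grid_kwargs):
        score = evaluate(*params)
        if best is None or score < best[1]:
            best = (params, score)
    return best


def make_tm_evaluator(X_train, Y_train, X_val, Y_val, epochs=30):
    """Build an evaluate(number_of_clauses, T, s) function backed by pyTsetlinMachine.

    Hypothetical helper: epochs=30 and validation RMSE are illustrative choices.
    """
    import numpy as np
    from pyTsetlinMachine.tm import RegressionTsetlinMachine

    def evaluate(number_of_clauses, T, s):
        tm = RegressionTsetlinMachine(number_of_clauses, T, s)
        tm.fit(X_train, Y_train, epochs=epochs)
        error = tm.predict(X_val) - Y_val
        return float(np.sqrt(np.mean(error ** 2)))

    return evaluate
```

With the defaults this evaluates 12 configurations instead of a full three-dimensional grid; at roughly 3 minutes per run that is still over half an hour, so narrowing `scales` or `s_values` first may be worthwhile. A call would look like `search(make_tm_evaluator(X_train, Y_train, X_val, Y_val))`. To try the multi-granular option from the reply instead of searching over s, the evaluator could construct the machine with s=2.0 and s_range=20.0 and the search could vary only the scale.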