RATSQL v2 vs v3?
Hey, I was looking at the leaderboard, and it is not clear what was changed from version 2 to 3.
RATSQL v3 + BERT (DB content used) 69.7 | 65.6
RATSQL v2 + BERT (DB content used) 62.7 | 57.2
Can you elaborate on what changes were made? Thank you.
(Note: you seem to have included the values of RATSQL v2 rather than RATSQL v2 + BERT. The latter stands at 65.8 | 61.9.)
V2 improves upon V1 with (a) value-based linking, (b) BERT-large, (c) hyperparameter tuning. V3 improves upon V2 only via much longer training (90K steps) with a tiny learning rate, since we discovered that the double descent phenomenon helps the model slowly gain more accuracy.
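For a rough sense of how much the longer training bought, here is a minimal sketch that subtracts the v2 + BERT scores from the v3 + BERT scores quoted in this thread. The helper name `score_gains` is made up for illustration, and reading the two columns as dev and test accuracy (in that order) is an assumption, not something stated above.

```python
# Scores quoted in this thread, as (first column, second column) pairs.
# Assumption: the columns are dev and test accuracy, in that order.
v2_bert = (65.8, 61.9)  # RATSQL v2 + BERT (corrected figures)
v3_bert = (69.7, 65.6)  # RATSQL v3 + BERT


def score_gains(old, new):
    """Return the per-column difference new - old, rounded to one decimal."""
    return tuple(round(n - o, 1) for o, n in zip(old, new))


gains = score_gains(v2_bert, v3_bert)
print(gains)  # (3.9, 3.7)
```

So, going by these figures, the difference between v2 + BERT and v3 + BERT is a little under 4 points in each column.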
@alexpolozov Do the hyperparameters you found via the double descent phenomenon give the best accuracy? Or is there a chance of gaining more accuracy if we play around with these hyperparameters?