senthurRam33
@berlino While evaluating the dev dataset, did you further clean it? I trained the model and tested it for accuracy, and it gives me only around 60% at 70000...
  I have trained the BERT model and plotted my model's loss against the loss of @alexpolozov. The two loss curves overlap one another. But the model...
@DevanshChoubey I had the same loss initially. Try running the model for at least 10000 steps, so that the train loss settles around 2.0 and the val loss will...
It was around 60%, but I didn't check the model on the newly released Spider dataset.
@dorajam I have only changed the batch size, because of memory issues; the remaining hyperparameters were used as they are. I changed the batch size to 2x8 (bs x num_batch_accumulated). Try retraining the model...
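To illustrate what the 2x8 setting means: with gradient accumulation, gradients from bs-sized micro-batches are summed over num_batch_accumulated forward passes before a single optimizer step, so the effective batch size stays bs * num_batch_accumulated. A minimal dependency-free sketch (the function name is hypothetical, not from the repo):

```python
def train_step(micro_batch_grads, num_batch_accumulated):
    """Accumulate micro-batch gradients, scaling each by the accumulation
    count (as one would scale the loss), and return the averaged update."""
    accum = 0.0
    for g in micro_batch_grads:
        accum += g / num_batch_accumulated
    return accum

# 8 micro-batches of size 2 behave like one batch of 16 for the update:
grads = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
update = train_step(grads, num_batch_accumulated=8)
print(update)  # 4.5, the mean gradient over the effective batch
```

The update is numerically the mean gradient, which is why shrinking bs while raising num_batch_accumulated by the same factor keeps training behavior close to the original configuration (aside from batch-norm-style statistics).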
@alexpolozov @rshin https://github.com/microsoft/rat-sql/issues/7#issuecomment-662271122 In this issue you attached your log file. The training loss has gone down to 0, but the val loss has stayed around 6....
@ygan Did you train the model with the new Spider dataset?
@alexpolozov Do the hyperparameters deduced using the double descent phenomenon produce the best accuracy? Or, if we play around with these hyperparameters, do we have a chance to gain more...
In enc_dec.py, in the class EncDecModel, the method begin_inference has a variable enc_state that I think holds the heatmap data (not entirely sure). If you plot that you will get...
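If you want a quick look at it: assuming enc_state carries a 2D score matrix (an assumption; I haven't verified its attribute layout), any such matrix can be inspected as a rough text heatmap without extra dependencies — in practice matplotlib's plt.imshow would be the usual choice:

```python
def ascii_heatmap(matrix, shades=" .:-=+*#"):
    """Render a 2D list of floats as text, mapping each value to a shade
    character by min-max scaling across the whole matrix."""
    lo = min(v for row in matrix for v in row)
    hi = max(v for row in matrix for v in row)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant matrix
    return "\n".join(
        "".join(shades[int((v - lo) / span * (len(shades) - 1))] for v in row)
        for row in matrix
    )

# Toy 3x3 score matrix; a real matrix pulled from enc_state would plug in here.
scores = [[0.0, 0.5, 1.0],
          [0.5, 1.0, 0.5],
          [1.0, 0.5, 0.0]]
print(ascii_heatmap(scores))
```

Darker characters mark larger scores, which is usually enough to see where the encoder is putting its attention before setting up a proper plot.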
We also faced the same issue in our training. Our best guess is that it occurred because of a gradient explosion. Even when you try to run the model from the...
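If gradient explosion is indeed the culprit, clipping the global gradient norm is the standard guard; in PyTorch that is torch.nn.utils.clip_grad_norm_. A dependency-free sketch of the same idea, assuming the gradients have been flattened into a list of floats:

```python
import math

def clip_grad_norm(grads, max_norm):
    """Rescale the gradient vector so its L2 norm is at most max_norm,
    returning the (possibly rescaled) gradients and the pre-clip norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm

clipped, norm = clip_grad_norm([3.0, 4.0], max_norm=2.5)
print(norm)     # 5.0 before clipping
print(clipped)  # [1.5, 2.0], rescaled to norm 2.5
```

Logging the pre-clip norm is also a cheap way to confirm the explosion: a spike in that value right before the loss blows up points at the offending step.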