DocRed
Question about the value of 'not NA acc'.
Dear authors: Thanks for your BERT implementation on DocRED. I have a question: the value of 'not NA acc' is quite large during training, and when the model converges it even approaches 1. However, the test F1 is more ordinary, at about 0.54. Beyond that, I find that the corresponding value in the original implementation (ACL-19) with LSTM seems in line with the final test F1. So I would like to know why 'not NA acc' and 'test F1' differ so much during training. Looking forward to your reply!
Thanks for pointing that out! We think this is caused by overfitting. 'Not NA acc' is computed on the training data, while 'test F1' is computed on the development data. It looks like the BERT model is overfitting the training data and getting nearly 100% accuracy on it.
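For clarity, 'not NA acc' can be read as plain accuracy restricted to examples whose gold relation label is not NA (the "no relation" class). A minimal sketch of that computation (the function name and the convention that NA is label id 0 are assumptions for illustration, not the repo's exact code):

```python
import numpy as np

def not_na_accuracy(preds: np.ndarray, labels: np.ndarray, na_id: int = 0) -> float:
    """Accuracy over examples whose gold relation is not NA (hypothetical helper)."""
    mask = labels != na_id          # keep only non-NA gold labels
    if mask.sum() == 0:
        return 0.0                  # no non-NA examples in this batch
    return float((preds[mask] == labels[mask]).mean())

# Toy batch: 3 of 4 gold labels are non-NA, 2 of those 3 are predicted correctly.
preds = np.array([0, 1, 2, 1])
labels = np.array([0, 1, 2, 2])
print(not_na_accuracy(preds, labels))  # -> 0.666...
```

Because this is measured on the training split while F1 is measured on the dev split, an overfit model can push the former toward 1.0 without the latter improving.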