CodeRL
Questions about the critic model results
Hello, I noticed that you have trained a four-class classification model (the Critic). What are its accuracy, recall, and F1 score on the APPS test set? And how do you determine whether the critic model is ready to use?
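To make the question concrete, here is a minimal sketch of the kind of evaluation I have in mind. It is not taken from the CodeRL code; the function name `critic_report`, the integer class ids 0 to 3, and the choice of macro averaging are all my own assumptions for illustration.

```python
from collections import Counter


def critic_report(y_true, y_pred, num_classes=4):
    """Accuracy plus per-class and macro-averaged recall and F1.

    Hypothetical helper: labels are assumed to be integer class ids
    in range(num_classes); this is not part of the CodeRL repository.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    recall, f1 = [], []
    for c in range(num_classes):
        pred_c = tp[c] + fp[c]  # number of predictions of class c
        true_c = tp[c] + fn[c]  # number of true examples of class c
        precision = tp[c] / pred_c if pred_c else 0.0
        rec = tp[c] / true_c if true_c else 0.0
        denom = precision + rec
        recall.append(rec)
        f1.append(2 * precision * rec / denom if denom else 0.0)

    return {
        "accuracy": sum(tp.values()) / len(y_true) if y_true else 0.0,
        "recall": recall,
        "f1": f1,
        "macro_recall": sum(recall) / num_classes,
        "macro_f1": sum(f1) / num_classes,
    }


if __name__ == "__main__":
    # Toy labels only, not real critic outputs.
    y_true = [0, 0, 1, 1, 2, 2, 3, 3]
    y_pred = [0, 1, 1, 1, 2, 3, 3, 3]
    print(critic_report(y_true, y_pred))
```

Numbers of this form on the APPS test set (overall and per class) would help me judge whether my own critic checkpoint is trained well enough.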