Dan
#self-assign
Hi, I have the same concern as @TonyZhanghm: the InstructGPT paper and similar work use a pairwise loss when training the reward model. I just found out this reward model...
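For anyone following along, here is a minimal sketch of the pairwise loss I mean, assuming the usual form -log(sigmoid(r_chosen - r_rejected)) for one comparison pair. The function and argument names are my own for illustration and are not taken from this repo.

```python
import math


def pairwise_reward_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise ranking loss for one (chosen, rejected) pair.

    Computes -log(sigmoid(r_chosen - r_rejected)), i.e. softplus of the
    negated score difference. Names here are hypothetical.
    """
    diff = r_chosen - r_rejected
    # Numerically stable softplus(-diff) = log(1 + exp(-diff)).
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def batch_pairwise_loss(pairs: list[tuple[float, float]]) -> float:
    """Mean pairwise loss over (r_chosen, r_rejected) score pairs."""
    return sum(pairwise_reward_loss(c, r) for c, r in pairs) / len(pairs)
```

The loss is small when the reward model scores the preferred response well above the rejected one and grows roughly linearly when the ordering is wrong, so only the score difference matters, not the absolute values.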
We have updated our GitHub repository, https://github.com/HLTCHKUST/MulQG, where you can find the latest version of the code and a more detailed README. For your specific questions: A1: The answerability-metric is from...