LMFlow
How to evaluate RAFT?
Does the repo provide code for evaluating the model after RAFT fine-tuning? For example, computing reward or perplexity on the IMDB test set.
Thanks for the feedback!
We plan to add these features in an upcoming update. For your current need, you can look at the main loop of RAFT around line 442 of src/lmflow/pipeline/raft_aligner.py and evaluate the model at each iteration.
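For a concrete starting point, here is a rough sketch of a per-iteration evaluation step: load the checkpoint produced at an iteration, generate continuations for a fixed set of held-out prompts, and report the mean reward. The checkpoint path, prompt list, and the reward-model interface (a sequence-classification head whose logit is the reward) are assumptions for illustration, not LMFlow's API.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def evaluate_checkpoint(checkpoint_dir, reward_model, reward_tokenizer, prompts, device="cuda"):
    """Generate continuations for held-out prompts and return the mean reward."""
    tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
    model = AutoModelForCausalLM.from_pretrained(checkpoint_dir).to(device).eval()

    rewards = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(device)
        with torch.no_grad():
            output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=True)
        text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

        # Score prompt + continuation with the reward model
        # (assumed here to be a single-logit classification head).
        reward_inputs = reward_tokenizer(text, return_tensors="pt", truncation=True).to(device)
        with torch.no_grad():
            reward = reward_model(**reward_inputs).logits[0, 0].item()
        rewards.append(reward)

    return sum(rewards) / len(rewards)
```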
For perplexity, you can check the example provided by Hugging Face at https://huggingface.co/docs/transformers/perplexity, and for the reward, you can check the function _get_batch_dataset_top in raft_aligner.py.
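A minimal sketch following that Hugging Face guide, applied to the IMDB test split: concatenate the test texts, slide a window over the token stream, and accumulate the negative log-likelihood of the tokens not already scored. The checkpoint path is a hypothetical placeholder.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "output_models/raft_iter_10"  # hypothetical checkpoint path
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).to(device).eval()

# Concatenate the IMDB test split into one long token stream, as in the HF guide.
test = load_dataset("imdb", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

max_length = 1024
stride = 512
seq_len = encodings.input_ids.size(1)

nlls = []
prev_end = 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_length, seq_len)
    trg_len = end - prev_end  # only score tokens not already scored
    input_ids = encodings.input_ids[:, begin:end].to(device)
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100  # mask context tokens out of the loss

    with torch.no_grad():
        loss = model(input_ids, labels=target_ids).loss
    nlls.append(loss * trg_len)

    prev_end = end
    if end == seq_len:
        break

ppl = torch.exp(torch.stack(nlls).sum() / prev_end)
print(f"perplexity: {ppl.item():.2f}")
```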
Also, a bit of personal experience: most of the time the reward function is far from perfect. For instance, the reward model we trained on the IMDB dataset only achieves about 92% accuracy on the test set, and the reward model we trained for the hh-rlhf dataset reaches only ~80% accuracy. So early stopping is important to keep the model from losing too much fluency and diversity.
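For reference, an accuracy number like that is usually measured as pairwise accuracy on a held-out preference set: count how often the reward model scores the chosen response above the rejected one. The sketch below assumes a single-logit classification-head reward model at a hypothetical path and the standard chosen/rejected fields of the hh-rlhf dataset.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

reward_path = "output_models/reward_model"  # hypothetical path
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(reward_path)
reward_model = AutoModelForSequenceClassification.from_pretrained(reward_path).to(device).eval()

pairs = load_dataset("Anthropic/hh-rlhf", split="test")


def score(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512).to(device)
    with torch.no_grad():
        return reward_model(**inputs).logits[0, 0].item()


correct = sum(score(ex["chosen"]) > score(ex["rejected"]) for ex in pairs)
print(f"pairwise accuracy: {correct / len(pairs):.3f}")
```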