llama-recipes
How to evaluate the summarization model's performance using ROUGE score
I was able to replicate the quick start notebook, but I am not sure how to evaluate the fine-tuned model's performance. Is there a built-in method for evaluation?
@hxue3 it has not been done in the notebook, but you should be able to use the same function, get_preprocessed_dataset(tokenizer, samsum_dataset, 'train'), with 'validation' and 'test' in place of 'train', and pass the result as eval_dataset to the trainer.
Hi! It seems that a solution has been provided to the issue and there has not been a follow-up conversation for a long time. I will close this issue for now and feel free to reopen it if you have any questions!