frank
FRANK: Factuality Evaluation Benchmark
Thanks for making this benchmark, it's very useful! The error `RelE` doesn't appear in the paper, while the error `PredE` doesn't appear in the data. I assume it's the same...
Hi Artidoro, I really like this work. Since it seems like you went to the effort of getting many scoring algorithms to run, did you have any plans to provide...
Hi, I'm glad to have found such a good benchmark for summarization metrics. I have several questions after going through the code and paper. As far as...
Hi again :) I was checking my own implementation of the FactCC scoring you described in the paper against your data, and noticed that for 90 cases we derived different...
Can you provide the leaderboard results somewhere?