Kaiyang Guo issues

Results: 2 issues by Kaiyang Guo

GPT4 prompt when evaluating DPO

Thanks for sharing the amazing repo! The GPT-4 win rate prompt stated in the paper is attached below. As the HH dataset concerns both helpfulness and harmlessness, I wonder why only...

Codes for Evaluating Generative Benchmarks

Thanks for sharing this awesome repo! The paper reports results on MMLU, GSM8K, HumanEval, and BigBench-Hard. It seems this repo does not contain the code for evaluating on these benchmark...