about gpt-4-0125-preview reference answer

Open duguodong7 opened this issue 5 months ago • 4 comments

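Hello,

I have a question about evaluating on MT-bench. Were the reference answers you used generated with the command gen_api_answer.py --model gpt-4-0125-preview? That produces 80 reference answers. Did you then replace the ones for questions 100~130 with the 30 corrected answers from the official comment at https://github.com/lm-sys/FastChat/pull/3158? To summarize: the judge model is gpt-4-0125-preview, but how were the reference answers for the 80 questions obtained? And are the judge results reproducible, or do they fluctuate between runs?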
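To make the replacement step concrete, here is a minimal sketch of what I have in mind. It is an illustration of the question only, not a confirmed procedure: the helper names are hypothetical, and it assumes each answer file is JSONL with one record per line and a `question_id` field.

```python
import json


def load_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def write_jsonl(path, records):
    """Write a list of dicts as JSONL, one record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def merge_reference_answers(generated, corrected):
    """Return `generated` with every record whose question_id also appears
    in `corrected` swapped for the corrected record.

    Assumption: both inputs are lists of dicts keyed by `question_id`.
    Order follows `generated`; corrected records with no match are ignored.
    """
    corrected_by_id = {rec["question_id"]: rec for rec in corrected}
    return [corrected_by_id.get(rec["question_id"], rec) for rec in generated]
```

Usage would be to load the generated answers and the corrected answers with `load_jsonl`, pass both to `merge_reference_answers`, and save the result with `write_jsonl`. Is this roughly how the reference file was built?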

duguodong7 · Sep 10 '24 02:09