FastChat
What's the gpt-4 version behind the repo-provided reference answers?
Hi!
I'm using the default configuration in the llm_judge repo, but when I call the OpenAI API through different mirrors I get significantly different results. With llama3-8b, the scores from the two API providers are 8.038760 and 6.827044.
This raises a question: which gpt-4 version generated reference_answer/gpt-4.jsonl at the time the repo released it?
Same problem. I also got different scores from two API providers on the same inference results generated by MiniCPM-2B-DPO-BF16: one is 7.090625, the other 6.025. Did you find the reason?
No, but I suspect the different API providers serve different versions of the model. It has nothing to do with FastChat.
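One way to check this hypothesis: OpenAI-compatible endpoints echo the concrete model snapshot they actually served (e.g. "gpt-4-0613") in the top-level `model` field of the completion response, even when the request only asked for the alias "gpt-4". A minimal sketch that extracts this field from a response payload (the sample JSON below is hypothetical, trimmed to the relevant field):

```python
import json

def resolved_model(response_json: str) -> str:
    """Return the concrete model snapshot reported by an
    OpenAI-compatible chat completion response."""
    return json.loads(response_json)["model"]

# Hypothetical response body from one provider, trimmed for brevity.
sample = '{"id": "chatcmpl-abc123", "model": "gpt-4-0613", "choices": []}'
print(resolved_model(sample))  # gpt-4-0613
```

Logging this field for each judgment call from both providers would confirm whether they resolve "gpt-4" to different snapshots, which could explain the score gap.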