Large gap between self-evaluated Ola model results and the OpenCompass leaderboard

Open bobo0810 opened this issue 10 months ago • 4 comments

Evaluated with VLMEvalKit, using gpt-4-turbo as the judge model.

Mar 03 '25 09:03 bobo0810

https://github.com/Ola-Omni/Ola/issues/14

Mar 03 '25 09:03 bobo0810

@dongyh20 Could you please assist in resolving this question?

Mar 04 '25 02:03 PhoenixZ810

I will check this asap

Mar 04 '25 13:03 dongyh20

We have reported the results obtained on our own machine. They differ slightly from the official benchmark, but the difference is acceptable and may be caused by different machines or environments. See https://github.com/Ola-Omni/Ola/issues/14

Mar 05 '25 12:03 dongyh20