Haodong Duan

226 comments by Haodong Duan

Closing this issue since there has been no response for weeks. Please reopen it if needed.

Hi, @40459447, the toolkit is developed for Linux, and we are not sure whether it runs properly on Windows.

Hi, @D1m7asis, would you please share a temporary account on the AI / ML API platform with me for testing (with, e.g., $10 of credit)? My email is [email protected].

According to my test, the download link for DynaMath works. Currently, we do not provide access to MMBench_TEST_EN_V11.

Hi, @LIRENDA621, I have re-evaluated this model (torch2.4+cu121, transformers==4.46.2) and got an accuracy of ~42.3%, which is lower than the previous evaluation results. However, we are not sure whether it's...

Hi, @hlp2020, Qwen2.5-VL-72B is too large to run one model instance on a single GPU. You can try launching the evaluation with the `python` command: `python run.py...
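To illustrate why a model of this size does not fit on one GPU, here is a rough back-of-the-envelope sketch. It is not part of the toolkit. It assumes a nominal 72 billion parameters stored in 16-bit precision (2 bytes each) and a hypothetical 80 GB GPU, and it counts weights only; activations, the KV cache, and framework overhead add to the real footprint.

```python
import math


def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpus_for_weights(num_params: float, gpu_memory_gb: float,
                         bytes_per_param: int = 2) -> int:
    """Lower bound on the number of GPUs needed just to hold the weights."""
    total_gb = weight_memory_gb(num_params, bytes_per_param)
    return math.ceil(total_gb / gpu_memory_gb)


if __name__ == "__main__":
    params = 72e9        # assumption: nominal parameter count of a "72B" model
    gpu_gb = 80          # assumption: one 80 GB accelerator
    print(weight_memory_gb(params))              # 144.0
    print(min_gpus_for_weights(params, gpu_gb))  # 2
```

Under these assumptions the weights alone need about 144 GB, so a single model instance has to be sharded across several GPUs rather than replicated once per GPU.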

They are two different repos for LMM evaluation. We provide reference evaluation results (https://huggingface.co/spaces/opencompass/open_vlm_leaderboard, etc.) along with the codebase. You can try both codebases and find the...

Mainly the versions of torch and transformers. In your case, which libraries and versions differ between the two environments?
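One way to answer this kind of question is to dump the installed package versions in each environment and compare them. The sketch below is only an illustration, not part of the toolkit: the helper names `installed_versions` and `diff_versions` are hypothetical, and it relies solely on the standard-library `importlib.metadata` module.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for the current environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return versions


def diff_versions(env_a, env_b):
    """Return {package: (version_in_a, version_in_b)} for packages that differ."""
    names = sorted(set(env_a) | set(env_b))
    return {
        name: (env_a.get(name), env_b.get(name))
        for name in names
        if env_a.get(name) != env_b.get(name)
    }


if __name__ == "__main__":
    # Run this in each environment and compare the printed dictionaries,
    # or save them (e.g. as JSON) and pass both to diff_versions.
    print(installed_versions(["torch", "transformers"]))
```

Running the script in both environments and feeding the two dictionaries to `diff_versions` lists only the packages whose versions disagree, which narrows down where a score gap might come from.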

> [@kennymckormick](https://github.com/kennymckormick) Hi, @bobo0810, for MathVista, we adopt gpt-4o-mini as the **judge model**; for MMVet, we adopt gpt-4-turbo-1106 as the **judge model**; for other MCQ benchmarks, we adopt gpt-3.5-turbo-0125...

> [@kennymckormick](https://github.com/kennymckormick) Thank you for your reply. However, OpenAI has already removed the old models. How can we solve this problem?

Currently, you can replace the old models with newer...