eval-scope arena-hard和aplace_eval数据集review阶段卡住

用的最新的main分支，judge_worker_num为1。

May 08 '25 03:05 Bigfishering

麻烦提供下运行的命令和log

May 08 '25 05:05 Yunnglin

要卡很久才会报出来用的命令是: evalscope eval --api-url http://0.0.0.0:8802/v1 --model /workspace/model_zoo/DeepSeek-R1-tokenizer --api-key EMPTY --eval-type service --datasets alpaca_eval --dataset-args '{"alpaca_eval": {"few_shot_num": 0, "filters": {"remove_until":"</think>"}}}' --eval-batch-size 128 --generation-config max_tokens=32768,temperature=0.0 --timeout 36000 --judge-strategy llm_recall --judge-model-args '{"api_url":"http://0.0.0.0:8802/v1/chat/completions","model_id":"/workspace/model_zoo/DeepSeek-R1-tokenizer"}'

报的错是

May 09 '25 03:05 Bigfishering

麻烦提供下运行的命令和log

已提供

May 09 '25 07:05 Bigfishering

请问，尝试更换judge model后还会卡着吗

May 19 '25 09:05 Yunnglin

Thank you for your feedback! We will close this issue now. If you have any further questions, please feel free to reopen it. If EvalScope has been helpful to you, please consider giving us a STAR to show your support. Thank you!

Jul 03 '25 05:07 Yunnglin