arena-hard和aplace_eval数据集review阶段卡住
用的最新的main分支,judge_worker_num为1。
麻烦提供下运行的命令和log
要卡很久才会报出来
用的命令是:
evalscope eval --api-url http://0.0.0.0:8802/v1 --model /workspace/model_zoo/DeepSeek-R1-tokenizer --api-key EMPTY --eval-type service --datasets alpaca_eval --dataset-args '{"alpaca_eval": {"few_shot_num": 0, "filters": {"remove_until":"</think>"}}}' --eval-batch-size 128 --generation-config max_tokens=32768,temperature=0.0 --timeout 36000 --judge-strategy llm_recall --judge-model-args '{"api_url":"http://0.0.0.0:8802/v1/chat/completions","model_id":"/workspace/model_zoo/DeepSeek-R1-tokenizer"}'
报的错是
麻烦提供下运行的命令和log
已提供
请问,尝试更换judge model后还会卡着吗
Thank you for your feedback! We will close this issue now. If you have any further questions, please feel free to reopen it. If EvalScope has been helpful to you, please consider giving us a STAR to show your support. Thank you!