lm-evaluation-harness
Support torchrun vllm DP
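Adds support for vLLM data parallelism (DP) when lm_eval is launched with torchrun in external launcher mode (`distributed_executor_backend=external_launcher`). The vLLM side of this feature comes from PR https://github.com/vllm-project/vllm/pull/24899. Example command and the resulting gsm8k scores: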
torchrun --nproc-per-node=8 --no-python lm_eval --model vllm --model_args "pretrained=/data/local/models/oss/DeepSeek-R1-0528,max_model_len=20000,gpu_memory_utilization=0.9,tensor_parallel_size=1,data_parallel_size=8,enable_expert_parallel=true,max_num_seqs=256,distributed_executor_backend=external_launcher" --batch_size 256 --task gsm8k
|Tasks|Version| Filter |n-shot| Metric | |Value| |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k| 3|flexible-extract| 5|exact_match|↑ |0.956|± |0.0056|
| | |strict-match | 5|exact_match|↑ |0.953|± |0.0058|
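
For readability, the launch command above can be assembled from variables. The sketch below is only an illustration: the variable names (`MODEL_PATH`, `TP_SIZE`, `DP_SIZE`, `NPROC`, `MODEL_ARGS`, `CMD`) are hypothetical, and deriving the torchrun process count as `tensor_parallel_size * data_parallel_size` is an assumption that happens to match the run above (1 x 8 = 8), not something this PR states. The script prints the command instead of running it.

```shell
# Values copied from the command in this PR description.
MODEL_PATH="/data/local/models/oss/DeepSeek-R1-0528"
TP_SIZE=1
DP_SIZE=8

# Assumption: one torchrun process per (DP rank, TP rank) pair.
NPROC=$((TP_SIZE * DP_SIZE))

# Build the comma-separated --model_args string one key at a time.
MODEL_ARGS="pretrained=${MODEL_PATH}"
MODEL_ARGS="${MODEL_ARGS},max_model_len=20000"
MODEL_ARGS="${MODEL_ARGS},gpu_memory_utilization=0.9"
MODEL_ARGS="${MODEL_ARGS},tensor_parallel_size=${TP_SIZE}"
MODEL_ARGS="${MODEL_ARGS},data_parallel_size=${DP_SIZE}"
MODEL_ARGS="${MODEL_ARGS},enable_expert_parallel=true"
MODEL_ARGS="${MODEL_ARGS},max_num_seqs=256"
MODEL_ARGS="${MODEL_ARGS},distributed_executor_backend=external_launcher"

# MODEL_ARGS contains no spaces here, so a plain string is enough.
CMD="torchrun --nproc-per-node=${NPROC} --no-python lm_eval --model vllm --model_args ${MODEL_ARGS} --batch_size 256 --task gsm8k"

# Print the command for inspection; copy and run it to launch the evaluation.
echo "${CMD}"
```

Keeping `TP_SIZE` and `DP_SIZE` as the only inputs makes it harder for `--nproc-per-node` and the parallelism settings in `--model_args` to drift apart when the command is adapted to a different number of GPUs.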