HammondWen

Results 2 comments of HammondWen

@yuqie Hi, you can try it like this. Start the server first:

    vllm serve /mnt/model --tensor-parallel-size 8 --pipeline-parallel-size 1 --trust-remote-code \
        --served-model-name your_model_name

Then run the benchmark against it:

    python3 ./vllm/benchmarks/benchmark_serving.py \
        --model /mnt/model/ \
        --dataset-name sharegpt \
        --dataset-path /mnt/ShareGPT_V3_unfiltered_cleaned_split.json.1 \
        ...
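Since the benchmark script only works once the server is accepting requests, it can help to wait for it before launching the second command. The following is a minimal sketch, not part of the original comment: `wait_for_server` is a hypothetical helper name, and the URL in the usage comment assumes the server listens on vLLM's default port 8000 and exposes the OpenAI-compatible `/v1/models` route.

```shell
# Hypothetical helper (not part of vLLM): poll a URL until it answers
# successfully or the retry budget runs out.
#   $1 = URL to poll
#   $2 = maximum number of attempts (default 60)
#   $3 = seconds to sleep between attempts (default 5)
wait_for_server() {
  url="$1"
  retries="${2:-60}"
  delay="${3:-5}"
  i=0
  while [ "$i" -lt "$retries" ]; do
    # -s silences progress output, -f makes curl fail on HTTP errors
    if curl -sf "$url" > /dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (assumption: default port 8000 on the local machine):
#   wait_for_server http://localhost:8000/v1/models && echo "server is ready"
```

The function returns 0 as soon as the endpoint responds and 1 if it never does, so it can be chained with `&&` in front of the benchmark command.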

![Image](https://github.com/user-attachments/assets/b151c102-3021-43ee-a4fe-a7d6b1bda655)