ImmarKarim

Results: 1 comment by ImmarKarim

> ```
> VLLM_CPU_OMP_THREADS_BIND="0|1|2|3|4|5|6|7" vllm serve DeepSeek-R1-Distill-Qwen-14B --max-model-len 1024 --dtype bfloat16 --served-model-name DeepSeek -tp 8 --distributed-executor-backend mp
> ```

@LBJ6666 can you please share how many output tokens/sec were you...