sekh77

3 comments by sekh77

@youkaichao - Is this change now available in version 0.6.2? I have a requirement to load LLaMA 3.2 90B vision model across four GPUs spread across two nodes using pipeline...

@DarkLight1337 - Is PP supported for the Databricks DBRX model (databricks/dbrx-instruct)?

Command that I'm using to load the model: vllm serve meta-llama/Llama-3.2-90B-Vision-Instruct --enforce-eager --max-num-seqs 16 --tensor-parallel-size 4
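Note that the command above uses tensor parallelism only, while the earlier comment describes four GPUs spread across two nodes, which is the pipeline-parallel case. A hedged sketch of what that multi-node launch could look like, assuming 2 GPUs per node on 2 Ray-connected nodes (the flags shown are real vLLM options, but the node layout and head-node address are assumptions, not from this thread):

```shell
# Sketch only, not the exact command from this thread. Assumes a 2-node
# cluster with 2 GPUs per node (2 x 2 = 4 GPUs total).

# First, join both nodes into one Ray cluster (head-node IP is hypothetical):
#   node 0:  ray start --head --port=6379
#   node 1:  ray start --address=<head-node-ip>:6379

# Then launch once: pipeline parallelism splits layers across the two nodes,
# tensor parallelism splits each layer across the 2 GPUs within a node.
vllm serve meta-llama/Llama-3.2-90B-Vision-Instruct \
  --enforce-eager \
  --max-num-seqs 16 \
  --tensor-parallel-size 2 \
  --pipeline-parallel-size 2 \
  --distributed-executor-backend ray
```

The general rule is that tensor-parallel-size times pipeline-parallel-size must equal the total GPU count, with tensor parallelism kept within a node to avoid cross-node traffic on every layer.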