Comments of sekh77
@youkaichao - Is this change now available in version 0.6.2? I have a requirement to load LLaMA 3.2 90B vision model across four GPUs spread across two nodes using pipeline...
@DarkLight1337 - Is PP supported for the Databricks DBRX model, databricks/dbrx-instruct?
Command that I'm using to load the model: vllm serve meta-llama/Llama-3.2-90B-Vision-Instruct --enforce-eager --max-num-seqs 16 --tensor-parallel-size 4
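The command above uses only tensor parallelism (`--tensor-parallel-size 4`), while the question asks about splitting the model across two nodes with pipeline parallelism. A hedged sketch of the kind of invocation that distributes 4 GPUs over 2 nodes, assuming vLLM's `--pipeline-parallel-size` and `--distributed-executor-backend` flags and a Ray cluster already spanning both nodes (the specific sizes here are illustrative, not from the original comments):

```shell
# Sketch: 2 pipeline stages (one per node) x 2-way tensor parallelism = 4 GPUs total.
# Assumes a Ray cluster has been started across both nodes beforehand
# (e.g. `ray start --head` on node 1, `ray start --address=<head-ip>:6379` on node 2).
vllm serve meta-llama/Llama-3.2-90B-Vision-Instruct \
  --enforce-eager \
  --max-num-seqs 16 \
  --pipeline-parallel-size 2 \
  --tensor-parallel-size 2 \
  --distributed-executor-backend ray
```

With this layout, each pipeline stage holds half the model's layers and is itself sharded across 2 GPUs on one node, so cross-node traffic is limited to activations passed between stages.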