docs: Add support matrix for model parallelism in OpenAI Frontend
Add a support matrix (and known limitations) for multi-GPU models with the vLLM and TRT-LLM backends.