Marut Pandya

Note: You can only load one model at a time; hence, in Quick Deploy only a single model is assigned.

I've created a custom Docker image `runpod/worker-v1-vllm:v2.8.0gptoss-cuda12.8.1` that allows us to deploy the openai/gpt-oss-* models on RunPod Serverless. This is an experimental release and subject to change as vLLM adds full...
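For anyone wiring this up, here is a minimal sketch of how a deployed endpoint could be queried, assuming the worker exposes RunPod's standard OpenAI-compatible route; the endpoint ID, API key, and model name below are placeholders, not values from this thread:

```python
# Minimal sketch: querying a gpt-oss model served by the custom vLLM worker
# on RunPod Serverless through the OpenAI-compatible route.
# Assumptions: YOUR_ENDPOINT_ID, YOUR_RUNPOD_API_KEY, and the model name
# are placeholders that must match your own deployment.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_RUNPOD_API_KEY",  # RunPod API key, not an OpenAI key
    base_url="https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/openai/v1",
)

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",  # must match the single model loaded by the worker
    messages=[{"role": "user", "content": "Hello from RunPod Serverless!"}],
)
print(response.choices[0].message.content)
```

Since only one model can be loaded per endpoint, the `model` field here simply has to match whatever model the worker was configured with at deploy time.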