Marut Pandya
Note: You can only load one model at a time, hence in Quick Deploy only a single model is assigned.
I've created a custom Docker image `runpod/worker-v1-vllm:v2.8.0gptoss-cuda12.8.1` that allows us to deploy the openai/gpt-oss-* models on RunPod Serverless. This is an experimental release and subject to change as vLLM adds full...
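In case it helps, here is a minimal sketch of how an endpoint built from this image could be queried once it is running, assuming it exposes the standard OpenAI-compatible route of the RunPod vLLM worker. `RUNPOD_ENDPOINT_ID`, `RUNPOD_API_KEY`, and the `openai/gpt-oss-20b` model name are placeholders for your own values, not details taken from the image above.

```python
# Minimal sketch: query a RunPod Serverless endpoint running the vLLM worker
# through its OpenAI-compatible route. Endpoint ID, API key, and model name
# are placeholders (assumptions), not values from the deployment above.
import os

from openai import OpenAI

client = OpenAI(
    # The RunPod vLLM worker exposes an OpenAI-compatible API under /openai/v1.
    base_url=f"https://api.runpod.ai/v2/{os.environ['RUNPOD_ENDPOINT_ID']}/openai/v1",
    api_key=os.environ["RUNPOD_API_KEY"],
)

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",  # use whichever gpt-oss variant you deployed
    messages=[{"role": "user", "content": "Hello from RunPod Serverless!"}],
    max_tokens=128,
)

print(response.choices[0].message.content)
```

Since only one model can be loaded at a time, the `model` field should match the single model assigned to the endpoint at deploy time.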