Marut Pandya
Thanks for sharing the feedback. This worker currently supports serverless only. If you want to deploy on pods, it should be a straightforward vLLM deployment; let me know if I...
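For reference, a minimal sketch of what a direct vLLM deployment on a pod could look like, using vLLM's offline Python API; the model name and sampling values below are placeholders, not a recommended setup.

```python
# Minimal sketch, assuming the vllm package is installed on the pod and a GPU is available.
# "facebook/opt-125m" is only a placeholder model name.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")                  # load the model onto the pod's GPU
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Hello, my name is"], params)  # run a single prompt
for out in outputs:
    print(out.outputs[0].text)                        # print the generated completion
```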
https://github.com/runpod-workers/worker-vllm/issues/210 @hoblin @Staberinde. Let me know if you face any issues; I can take a look. Thanks.
Sure. We can look into this.
I think setting CUSTOM_CHAT_TEMPLATE should help?
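As a rough sketch, a Jinja-style chat template string like the one below could be set as the value of the CUSTOM_CHAT_TEMPLATE environment variable on the endpoint; the special tokens (<|user|>, <|assistant|>) are assumptions and depend entirely on the model you deploy.

```python
# Hypothetical chat template for illustration only; adjust the role markers
# to whatever your model actually expects before setting CUSTOM_CHAT_TEMPLATE.
chat_template = (
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}<|user|>\n{{ message['content'] }}\n"
    "{% elif message['role'] == 'assistant' %}<|assistant|>\n{{ message['content'] }}\n"
    "{% endif %}"
    "{% endfor %}"
    "<|assistant|>\n"
)
print(chat_template)  # paste this value into the CUSTOM_CHAT_TEMPLATE env var
```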
Can you share your request payload?
https://docs.runpod.io/serverless/workers/vllm/get-started. If you scroll down a bit, you will find some sampling parameters to adjust; please try it with those.
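A hedged sketch of a request that adjusts those sampling parameters is below; the ENDPOINT_ID and API_KEY values are placeholders, and the exact field names should be double-checked against the quick-start linked above.

```python
# Sketch of calling a RunPod serverless vLLM endpoint with custom sampling_params.
import requests

ENDPOINT_ID = "your-endpoint-id"   # placeholder
API_KEY = "your-runpod-api-key"    # placeholder

payload = {
    "input": {
        "prompt": "Explain what a chat template is.",
        "sampling_params": {       # sampling parameters to tune
            "temperature": 0.7,
            "top_p": 0.9,
            "max_tokens": 256,
        },
    }
}

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=120,
)
print(resp.json())                 # inspect the worker's response
```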
@ParthKarth Did you try it with a custom chat template?