worker-vllm

Error: chat template not supported

Open ParthKarth opened this issue 5 months ago • 3 comments

I am using RunPod serverless, initially with DeepSeek-R1 from Hugging Face, then switched to Llama4-Scout for testing, but I got this error. There are no issues on the Hugging Face side; the model is exactly the same as it was.

Output:

```json
{
  "delayTime": 48454,
  "error": "{'object': 'error', 'message': 'Chat template does not exist for this model, you must provide a single string input instead of a list of messages', 'type': 'BadRequestError', 'param': None, 'code': 400}",
  "executionTime": 254,
  "id": "2bee8688-4f7e-4d8d-b198-3ad3054f34d8-u2",
  "status": "FAILED",
  "workerId": "9sxpstx8jawche"
}
```

I switched back to DeepSeek-R1 and am still getting the same message. Nothing was changed, but there are processes that depend on this.

Here is the input:

{"input": { "messages": [ {"role": "system", "content": "You are a physics expert."}, {"role": "user", "content": "What is gravity?"} ] }}

The model supports chat templates.

This is the sample request from the HF model page. The same request works there but fails when routed through RunPod:

```shell
curl https://router.huggingface.co/v1/chat/completions \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{ "messages": [ { "role": "user", "content": "What is the capital of France?" } ], "model": "unsloth/DeepSeek-R1-Distill-Llama-8B:featherless-ai", "stream": false }'
```

ParthKarth avatar Jul 23 '25 15:07 ParthKarth

@ParthKarth Did you try with a custom chat template (`CUSTOM_CHAT_TEMPLATE`)?

pandyamarut avatar Jul 25 '25 17:07 pandyamarut

@pandyamarut Tried the custom template.

This is what was provided:

Hi there!

Could you try changing `CUSTOM_CHAT_TEMPLATE` to:

```jinja
{% if system %} {{ system }} {% endif %} {% for message in messages %} {% if message.role == "user" %} <|User|>{{ message.content }} {% elif message.role == "assistant" %} <|Assistant|>{{ message.content }}{% if not loop.last %}<|end▁of▁sentence|>{% endif %} {% endif %} {% endfor %} {% if messages[-1].role != "assistant" %} <|Assistant|> {% endif %}
```

Here is the output for the same input as before:

Getting blank output:

```python
ChatCompletion(id='chatcmpl-d4f077aaa63d42ecb5432219978fc973', choices=[Choice(finish_reason='length', index=0, logprobs=None, message=ChatCompletionMessage(content=' ', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[]), stop_reason=None)], created=1753372866, model='facebook/opt-125m', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=100, prompt_tokens=49, total_tokens=149, completion_tokens_details=None, prompt_tokens_details=None), prompt_logprobs=None)
```
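One way to sanity-check the suggested template is to render it locally with `jinja2` (a sketch I am adding, not something from the thread). Note that the for-loop only handles `user` and `assistant` roles and reads a separate `system` variable, so a system message inside `messages` is silently dropped:

```python
from jinja2 import Template

# The CUSTOM_CHAT_TEMPLATE suggested above, kept on one line so the
# whitespace between tags matches the original exactly.
template_str = (
    '{% if system %} {{ system }} {% endif %}'
    '{% for message in messages %}'
    '{% if message.role == "user" %} <|User|>{{ message.content }} '
    '{% elif message.role == "assistant" %} <|Assistant|>{{ message.content }}'
    '{% if not loop.last %}<|end▁of▁sentence|>{% endif %}'
    '{% endif %}'
    '{% endfor %}'
    '{% if messages[-1].role != "assistant" %} <|Assistant|> {% endif %}'
)

# The same input as in the original report.
messages = [
    {"role": "system", "content": "You are a physics expert."},
    {"role": "user", "content": "What is gravity?"},
]

prompt = Template(template_str).render(messages=messages)
print(repr(prompt))
# The loop has no branch for role == "system", and no `system` variable is
# passed, so "You are a physics expert." never appears in the prompt.
```

Rendering locally like this makes it easy to see what string actually reaches the model before blaming the worker.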

ParthKarth avatar Jul 25 '25 20:07 ParthKarth

Any update on this?

ParthKarth avatar Jul 28 '25 15:07 ParthKarth