[Bug]: Proxy is converting /v1/completions endpoint to /v1/chat/completions data structure
What happened?
The proxy converts /v1/completions requests into the /v1/chat/completions data structure before forwarding them upstream.

Payload sent:

curl -s --insecure http://0.0.0.0:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt": "def print_hello_world():", "model": "starcoder2-3b"}'

Using the latest git version (a311788f0da7fb052499e14463c18d1c84e6d739), starting litellm with:

poetry run litellm -c local.yml --port 8000 --detailed_debug
---
model_list:
  - model_name: starcoder2-3b
    litellm_params:
      api_base: https://vllm.example.com/v1
      api_key: "os.environ/API_KEY"
      model: openai/starcoder2-3b
      stream_timeout: 5
    model_info:
      mode: completion

litellm_settings:
  drop_params: true
  num_retries: 3
  request_timeout: 20
  allowed_fails: 3

general_settings:
  background_health_checks: true
  health_check_interval: 300
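For reference, the same failing request can be reproduced with the official openai Python client pointed at the proxy (a minimal sketch; the base_url and api_key values are placeholders for this setup):

from openai import OpenAI

# Point the client at the litellm proxy; the key value is a placeholder.
client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="sk-anything")

# Plain text completion against the proxy's /v1/completions route.
resp = client.completions.create(
    model="starcoder2-3b",
    prompt="def print_hello_world():",
    max_tokens=16,
)
print(resp.choices[0].text)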
Calling the vLLM OpenAI-compatible API directly works smoothly:

$ curl -s --insecure https://vllm.example.com/v1/completions \
    -H "Content-Type: application/json" \
    -d '{"prompt": "def print_hello_world():", "model": "starcoder2-3b"}' | jq
{
  "id": "cmpl-00a6b5104c024c71bc0ca4b001946f95",
  "object": "text_completion",
  "created": 1712766147,
  "model": "starcoder2-3b",
  "choices": [
    {
      "index": 0,
      "text": "\n print(\"Hello Python\")\n print(\"Hello World\")\n\n\ndef main",
      "logprobs": null,
      "finish_reason": "length",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 7,
    "total_tokens": 23,
    "completion_tokens": 16
  }
}
Relevant log output
18:16:37 - LiteLLM:INFO: utils.py:1112 -
POST Request Sent from LiteLLM:
curl -X POST \
https://vllm.example.com/v1/ \
-d '{'model': 'starcoder2-3b', 'messages': [{'role': 'user', 'content': 'def print_hello_world():'}], 'extra_body': {}}'
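In effect, the proxy rewrites the raw completion payload into a chat-completion payload before forwarding it, roughly:

# What the log above shows, in effect: the raw /v1/completions payload
# is rewritten into a /v1/chat/completions payload before forwarding.
completion_request = {"model": "starcoder2-3b", "prompt": "def print_hello_world():"}

chat_request = {
    "model": completion_request["model"],
    "messages": [{"role": "user", "content": completion_request["prompt"]}],
}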
Hi @bufferoverflow, I believe the fix is behind this PR: https://github.com/BerriAI/litellm/pull/2709

Will update this ticket once it's out.
Thanks @krrishdholakia, I will give that PR a try and report back.
Setting the text-completion-openai provider prefix in the model definition did the trick, thanks @krrishdholakia!
- model_name: starcoder2-3b
  litellm_params:
    api_base: https://vllm.example.com/v1
    api_key: THIS_IS_UNUSED
    model: text-completion-openai/starcoder2-3b
    stream_timeout: 5
  model_info:
    mode: completion
    metadata: >
      StarCoder2 trained with The Stack v2 dataset. More information:
      https://huggingface.co/bigcode/starcoder2-3b
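For anyone hitting the same thing outside the proxy, the provider prefix makes the same difference when calling litellm directly. A sketch, assuming litellm.completion and litellm.text_completion keep their documented signatures:

import litellm

# openai/ routes through the chat-completions code path, so a bare prompt
# gets wrapped into a messages list (the behavior reported above).
chat_resp = litellm.completion(
    model="openai/starcoder2-3b",
    messages=[{"role": "user", "content": "def print_hello_world():"}],
    api_base="https://vllm.example.com/v1",
    api_key="THIS_IS_UNUSED",
)

# text-completion-openai/ keeps the request on /v1/completions and
# preserves the raw prompt field.
text_resp = litellm.text_completion(
    model="text-completion-openai/starcoder2-3b",
    prompt="def print_hello_world():",
    api_base="https://vllm.example.com/v1",
    api_key="THIS_IS_UNUSED",
)
print(text_resp.choices[0].text)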