FastChat
fastchat.serve.openai_api_server doesn't work with `stream=true` parameter
I'm trying to get FastChat working with https://github.com/open-webui/open-webui via its option to accept OpenAI-compatible endpoints. TabbyAPI is able to serve the OpenAI API format with streaming/chunking, but FastChat doesn't seem to support it.
I've installed ExLlamaV2 and confirmed the API works well:
```shell
python3 -m fastchat.serve.controller --host 0.0.0.0 --port 21001
python3 -m fastchat.serve.model_worker --model-path ~/model/Llama-3-8B-Instruct-262k-8.0bpw-h6-exl2 --enable-exllama
python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000
```
Works fine w/o "stream": true
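For reference, a minimal non-streaming request like the one below returns a normal completion. The model name here is an assumption based on the model path above (FastChat typically registers the directory name), and I'm assuming the API server is reachable on localhost:8000 as launched above:

```python
import requests

# Assumes the openai_api_server started above is listening on localhost:8000
# and the worker registered under this model name (an assumption).
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "Llama-3-8B-Instruct-262k-8.0bpw-h6-exl2",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```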
Issue
Errors with "stream": true
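In case it helps anyone reproduce: adding "stream": true to the same request is what fails for me. This is roughly how I'm calling it, with minimal SSE parsing (same assumed model name and host as above):

```python
import json
import requests

# Same endpoint as above; only "stream": true is added.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "Llama-3-8B-Instruct-262k-8.0bpw-h6-exl2",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
    stream=True,
)
# OpenAI-compatible servers stream server-sent events: lines prefixed "data: ".
for line in resp.iter_lines():
    if not line:
        continue
    chunk = line.decode("utf-8").removeprefix("data: ")
    if chunk == "[DONE]":
        break
    delta = json.loads(chunk)["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)
```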
I also encountered this problem. Have you solved it?
FastChat streams output tokens on a different endpoint/module. Hoping it's on the roadmap to port this to fastchat.serve.openai_api_server.
Hi everyone! I have the same problem. Has anyone found a local solution?