FastChat: Add stream API; send POST requests to call the stream API synchronously

Open little51 opened this issue 1 year ago • 0 comments

Add a stream API (api_stream.py) that uses a dict to cache the streamed response from the worker; the client sends multiple requests and gets the result so far from the cache. Start the stream API server:

# kill all running fastchat processes
pkill -9 -f fastchat
# start controller
nohup python -u -m fastchat.serve.controller >> fastchat.log  2>&1 &
# start worker
CUDA_VISIBLE_DEVICES=0 nohup python -u -m fastchat.serve.model_worker --model-name 'vicuna-7b-v1.1' --model-path vicuna_data/vicuna-7b-v1.1 >> fastchat.log  2>&1 &
# start api server
FASTCHAT_CONTROLLER_URL=http://localhost:21001 CUDA_VISIBLE_DEVICES=0 nohup python -u -m fastchat.serve.api_stream --host 0.0.0.0 --port 8000 >> fastchat.log  2>&1 &
# tail log
tail -f  fastchat.log
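
The dict-cache idea described above can be sketched roughly as follows. This is not the code in api_stream.py; it is a minimal illustration under assumptions: the class name StreamCache, the poll method, the make_stream callable and the fake worker generator are all hypothetical, and appending "[stop]" when generation ends is assumed from the stop word mentioned in this issue.

```python
import threading
import time
from typing import Callable, Dict, Iterable

STOP_WORD = "[stop]"  # stop marker named in the issue; how the real server emits it is assumed


class StreamCache:
    """Hypothetical dict cache: maps a request key to the text generated so far."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._cache: Dict[str, str] = {}

    def poll(self, key: str, make_stream: Callable[[], Iterable[str]]) -> str:
        """Return the cached text for key; the first call starts generation."""
        with self._lock:
            if key not in self._cache:
                self._cache[key] = ""
                worker = threading.Thread(
                    target=self._consume, args=(key, make_stream()), daemon=True
                )
                worker.start()
            return self._cache[key]

    def _consume(self, key: str, chunks: Iterable[str]) -> None:
        """Background thread: append each streamed chunk to the cache entry."""
        for chunk in chunks:
            with self._lock:
                self._cache[key] += chunk
        with self._lock:
            self._cache[key] += STOP_WORD  # signal that generation has finished


def fake_worker_stream() -> Iterable[str]:
    """Stand-in for the model worker's streamed output (assumption)."""
    for word in ["Hello", ",", " how", " can", " I", " help", "?"]:
        time.sleep(0.05)
        yield word


if __name__ == "__main__":
    cache = StreamCache()
    text = cache.poll("request-1", fake_worker_stream)
    while not text.endswith(STOP_WORD):
        time.sleep(0.1)
        text = cache.poll("request-1", fake_worker_stream)
        print(text)
```

Each call with the same key returns immediately with whatever has been generated so far, which is why the client has to repeat the request until the stop word shows up.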

Test:

curl http://localhost:8000/v1/chat/completions/stream   \
-H "Content-Type: application/json"  \
-d '{"model": "vicuna-7b-v1.1","messages": [{"role": "user", "content": "Hello!"}]}'

Repeat the curl request at regular intervals until the returned result contains the stop word ([stop]).
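
The same polling loop can be written as a small Python client instead of re-running curl by hand. This is only a sketch: the URL, headers and payload are copied from the curl command above, while the function names, the one-second interval and the poll limit are assumptions, and the response body is treated as opaque text because its exact format is not described here.

```python
import json
import time
import urllib.request
from typing import Callable

URL = "http://localhost:8000/v1/chat/completions/stream"  # endpoint from the curl example
PAYLOAD = {
    "model": "vicuna-7b-v1.1",
    "messages": [{"role": "user", "content": "Hello!"}],
}
STOP_WORD = "[stop]"  # stop marker named in the issue


def post_once(url: str = URL, payload: dict = PAYLOAD) -> str:
    """Send one POST request and return the raw response body as text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # a body makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def poll_until_stop(
    fetch: Callable[[], str] = post_once,
    interval: float = 1.0,   # assumed polling interval in seconds
    max_polls: int = 120,    # assumed upper bound so the loop cannot run forever
) -> str:
    """Call fetch repeatedly until the body contains the stop word."""
    for _ in range(max_polls):
        body = fetch()
        if STOP_WORD in body:
            return body
        time.sleep(interval)
    raise TimeoutError(f"no {STOP_WORD} after {max_polls} polls")


if __name__ == "__main__":
    # Requires the controller, worker and api_stream server started as shown above.
    print(poll_until_stop())
```

Passing the fetch function as a parameter keeps the loop independent of the transport, so it can be exercised without a running server.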

https://user-images.githubusercontent.com/7981353/236590175-63c76d67-da09-448e-982c-94cf4ac2a721.mp4

https://user-images.githubusercontent.com/7981353/236590182-34c92beb-d89b-43a7-9d4b-a5da4a314e6c.mp4

May 06 '23 01:05 little51
