simon gao
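Added api_stream.py, which exposes the synchronous stream_chat call to HTTP requests in an asynchronous way. The stream_chat results are stored in a cache that is cleaned up periodically. The client retrieves the response from the cache by sending repeated POST requests, and stops once it encounters the stop word [stop]. Example call: curl -X POST "http://127.0.0.1:8000/stream" -H 'Content-Type: application/json' -d '{"prompt": "你好", "history": []}'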
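The polling loop described above can be sketched from the client side as follows. This is only a minimal illustration, not the code in api_stream.py: the URL, the request fields and the [stop] marker come from the description, while the `response` field name in the reply, the helper names `post_json` and `poll_stream`, and the `interval` / `max_polls` parameters are assumptions made for the example.

```python
import json
import time
import urllib.request

STREAM_URL = "http://127.0.0.1:8000/stream"  # endpoint from the curl example
STOP_TOKEN = "[stop]"                        # stop word named in the description


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON reply."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll_stream(prompt, history=None, post=post_json, interval=0.2, max_polls=500):
    """Re-send the same request until the cached reply contains STOP_TOKEN.

    Assumption (not confirmed by the source): each reply is a JSON object
    whose "response" field holds the text generated so far, with the stop
    word included once generation has finished.
    """
    payload = {"prompt": prompt, "history": history or []}
    for _ in range(max_polls):
        reply = post(STREAM_URL, payload)
        text = str(reply.get("response", ""))
        if STOP_TOKEN in text:
            # Generation finished: strip the marker and return the final text.
            return text.replace(STOP_TOKEN, "")
        time.sleep(interval)  # wait before asking the cache again
    raise TimeoutError("stop word not seen after %d polls" % max_polls)


if __name__ == "__main__":
    # Requires the stream API server to be running locally.
    print(poll_stream("你好"))
```

The `post` parameter exists only so the loop can be exercised without a running server; in normal use the default `post_json` is sufficient.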
Add a stream API (api_stream.py) that uses a dict to cache the streamed response from the worker; the client sends multiple requests to fetch the result from the cache. Start the stream API server: ``` # kill process batch pkill -9 -f fastchat...
Run `pip install tls_client`, or add `tls_client` to requirements.txt.