Richard Ginsberg
> This should be resolved by #3218

Just tested v0.1.30; the issue is still present.
Above I confirmed the issue persists in v0.1.30. To confirm it wasn't new in v0.1.30, I tried v0.1.29 as well. Same issue. `docker run -d --gpus=all -v /home/username/ollama:/root/.ollama -p 11434:11434 --name...
FastChat streams output tokens on another endpoint/module. Hoping it's on the roadmap to port this to fastchat.serve.openai_api_server.