Response in Chat isn't displayed during generation, only after it's finished
What happened?
When using this plugin, the model's response in the Chat tab stays empty until the AI completes the entire message, unlike other chat clients where each new incoming token is displayed immediately.
With slower local models (Ollama), this makes the plugin's chat unusable in practice.
Relevant log output or stack trace
No response
Steps to reproduce
No response
CodeGPT version
2.9.0-241.1
Operating System
Windows
How are you connecting to Ollama? It sounds like the stream request parameter is set to false.
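For reference, this is roughly what a non-streaming request against the native API looks like (a minimal sketch, assuming the default Ollama port and using llama3.1 only as an example model):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [{ "role": "user", "content": "hello" }],
  "stream": false
}'

With "stream": false, Ollama returns a single JSON object only after the whole response has been generated, which would match the behavior you're describing.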
I don't see any settings
Hmm, if you're connecting via the Ollama provider, then this parameter isn't configurable. I'm unable to reproduce this issue; I've tried multiple models, including Llama 3.1 8B, with the most recent version of Ollama.
How did you check whether CodeGPT waits for the entire response before rendering it on the screen?
It has an interesting behavior: I tried the prompt "write numbers from 1 to 1000" and it got stuck as usual, but after a couple of minutes, around number 500, it suddenly wrote everything and started slowly continuing token by token (before getting stuck again around 650). I'm unable to reproduce that now, though. At the very moment CodeGPT starts showing any output, Ollama gets unlocked, meaning my other client starts generating its new response after having been blocked, so it's definitely not streaming. Are there any logs for the requests?
I am getting the same issue here: just install the plugin from the marketplace, set the backend to Ollama, chat, and you see the bug.
And if you set the Provider to "Custom OpenAI" and configure the Ollama URL and model manually, it works fine, so it seems this issue only affects the "Ollama" provider.
OK, I think I found the issue. Streaming via their POST /api/chat API seems to be broken. I will change the underlying API to use the /v1/chat/completions endpoint instead.
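Roughly, a streaming request against Ollama's OpenAI-compatible endpoint would look like this (model name and port are placeholders, not the plugin's actual configuration):

curl http://localhost:11434/v1/chat/completions -d '{
  "model": "llama3.1",
  "messages": [{ "role": "user", "content": "hello" }],
  "stream": true
}'

With "stream": true, this endpoint sends the response back incrementally as "data:" chunks instead of a single JSON body.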
Hi @carlrobertoh ... This is not resolved for me, using version 2.11.7-241.1 in PyCharm 2024.2.1 (Professional Edition).
ollama version is 0.3.13
It is still using /api/chat
Adding to this: /api/chat streams just fine by default, as tested with:
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]
}'
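(Since "stream" defaults to true for /api/chat, this command returns newline-delimited JSON objects as tokens are generated, rather than one final response.)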