
Attempted to access streaming response content, without having called read()

Open rowbot1 opened this issue 1 year ago • 17 comments

Issue

I'm encountering an issue while trying to run Aider using a local LLM. The error message is as follows:

Aider v0.44.0
Model: ollama/llama3:70b with diff edit format
Git repo: .git with 15 files
Repo-map: using 1024 tokens
Unexpected error: Attempted to access streaming response content, without having called `read()`.
Traceback (most recent call last):
  File "/home/rowbot/.local/lib/python3.10/site-packages/aider/coders/base_coder.py", line 860, in send_new_user_message
    yield from self.send(messages, functions=self.functions)
  File "/home/rowbot/.local/lib/python3.10/site-packages/aider/coders/base_coder.py", line 1116, in send
    yield from self.show_send_output_stream(completion)
  File "/home/rowbot/.local/lib/python3.10/site-packages/aider/coders/base_coder.py", line 1204, in show_send_output_stream
    for chunk in completion:
  File "/home/rowbot/.local/lib/python3.10/site-packages/litellm/llms/ollama.py", line 356, in ollama_completion_stream
    raise e
  File "/home/rowbot/.local/lib/python3.10/site-packages/litellm/llms/ollama.py", line 315, in ollama_completion_stream
    status_code=response.status_code, message=response.text
  File "/usr/local/lib/python3.10/dist-packages/httpx/_models.py", line 576, in text
    content = self.content
  File "/usr/local/lib/python3.10/dist-packages/httpx/_models.py", line 570, in content
    raise ResponseNotRead()
httpx.ResponseNotRead: Attempted to access streaming response content, without having called `read()`.

Version and model info

Aider v0.44.0
Model: ollama/llama3:70b with diff edit format
Git repo: .git with 15 files
Repo-map: using 1024 tokens

rowbot1 avatar Jul 18 '24 14:07 rowbot1
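
For context on the exception: httpx raises ResponseNotRead whenever .text is accessed on a streaming response before read() has been called, and the traceback shows litellm doing exactly that while building an error message (status_code=response.status_code, message=response.text) from a failed streaming request. A minimal sketch of that httpx behaviour, with an assumed local Ollama endpoint and an illustrative payload rather than aider's actual request:

import httpx

# Illustrative only: stream a request to an assumed local Ollama server.
with httpx.Client() as client:
    with client.stream(
        "POST",
        "http://127.0.0.1:11434/api/generate",         # assumed default address
        json={"model": "llama3:70b", "prompt": "hi"},   # illustrative payload
    ) as response:
        # On a streamed response, .text is only available after read();
        # accessing it earlier raises httpx.ResponseNotRead, the exact
        # exception shown in the traceback above.
        response.read()
        print(response.text)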

I think I got it working. I am using llama, and I edited the base_coder.py file, replacing the show_send_output_stream(self, completion) function with:

def show_send_output_stream(self, completion):
    for chunk in completion:
        if isinstance(chunk, dict):
            # Handle dictionary response (from Ollama)
            if 'response' in chunk:
                yield chunk['response']
            continue

        # Original handling for object with 'choices' attribute
        if not hasattr(chunk, 'choices') or len(chunk.choices) == 0:
            continue

        content = chunk.choices[0].delta.content
        if content:
            yield content

rowbot1 avatar Jul 18 '24 16:07 rowbot1

Thanks for trying aider and filing this issue.

Sorry, aider works with ollama. Do you have any idea why your ollama is returning non-standard responses?

paul-gauthier avatar Jul 22 '24 07:07 paul-gauthier

Hi @paul-gauthier. Getting the same:

Unexpected error: Attempted to access streaming response content, without having called `read()`.
Traceback (most recent call last):
  File "/home/tyeress/venv_aider/lib/python3.9/site-packages/aider/coders/base_coder.py", line 858, in send_new_user_message
    yield from self.send(messages, functions=self.functions)
  File "/home/tyeress/venv_aider/lib/python3.9/site-packages/aider/coders/base_coder.py", line 1116, in send
    yield from self.show_send_output_stream(completion)
  File "/home/tyeress/venv_aider/lib/python3.9/site-packages/aider/coders/base_coder.py", line 1204, in show_send_output_stream
    for chunk in completion:
  File "/home/tyeress/venv_aider/lib/python3.9/site-packages/litellm/llms/ollama.py", line 356, in ollama_completion_stream
    raise e
  File "/home/tyeress/venv_aider/lib/python3.9/site-packages/litellm/llms/ollama.py", line 315, in ollama_completion_stream
    status_code=response.status_code, message=response.text
  File "/home/tyeress/venv_aider/lib/python3.9/site-packages/httpx/_models.py", line 576, in text
    content = self.content
  File "/home/tyeress/venv_aider/lib/python3.9/site-packages/httpx/_models.py", line 570, in content
    raise ResponseNotRead()
httpx.ResponseNotRead: Attempted to access streaming response content, without having called `read()`.

Running inside WSL:

curl -fsSL https://ollama.com/install.sh | sh
# TEST
ollama run deepseek-coder-v2:16b
# RUN
ollama serve

python3 -m venv venv_aider
source venv_aider/bin/activate
python -m pip install aider-chat
mkdir myproject
cd myproject
git init .
export OLLAMA_API_BASE=http://127.0.0.1:11434
aider --model ollama/deepseek-coder-v2
# AFTER FIRST PROMPT, I'M GETTING ABOVE ERROR

Using:

Aider v0.45.1
Model: ollama/deepseek-coder-v2 with whole edit format

ollama version is 0.2.8

atkalcec avatar Jul 24 '24 13:07 atkalcec

Same here. ollama run eramax/nxcode-cq-7b-orpo:q6, then:

aider --model ollama/nxcode-cq-7b-orpo:q6

Aider 0.45.1, ollama version is 0.2.8

cyysky avatar Jul 25 '24 15:07 cyysky

Solved it with aider --model ollama/eramax/nxcode-cq-7b-orpo:q6

Maybe there could be some indicator when the model is not found.

cyysky avatar Jul 25 '24 15:07 cyysky

Indeed. Monitoring ollama serve shows a 404, which POST /api/generate can return when the model doesn't exist. I had to explicitly use aider --model ollama/deepseek-coder-v2:16b, and then it works.

atkalcec avatar Jul 25 '24 22:07 atkalcec
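
A quick way to confirm which tags the local Ollama server actually has, so the exact name can be passed to aider as ollama/<name>, is to query Ollama's /api/tags endpoint (ollama list prints the same information on the command line). A small sketch, assuming the default server address:

import httpx

# List the models installed on the local Ollama server (default address assumed).
tags = httpx.get("http://127.0.0.1:11434/api/tags").json()
print([model["name"] for model in tags.get("models", [])])
# e.g. ['deepseek-coder-v2:16b', 'llama3.1:8b'] -- a request for any other
# tag gets the 404 described above.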

It may be that ollama doesn't support streaming for the model you are using? You could run aider with --no-stream and see if that helps?

paul-gauthier avatar Jul 30 '24 18:07 paul-gauthier
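
For anyone trying that suggestion, the flag goes on the command line along with the model, for example: aider --model ollama/deepseek-coder-v2:16b --no-stream (model name taken from the reports above).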

I have the same problem and can't fix it.

gejignas avatar Jul 30 '24 20:07 gejignas

This appears to have been a bug in litellm, which they have resolved.

https://github.com/BerriAI/litellm/issues/4974

I'm just waiting for 1.42.6 to publish and then I will upgrade aider to use it.

paul-gauthier avatar Jul 31 '24 13:07 paul-gauthier
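
Until that aider release lands, one possible interim workaround (untested here, and it may conflict with the litellm version aider pins) is to upgrade litellm in the same environment once 1.42.6 is published, e.g. python -m pip install --upgrade "litellm>=1.42.6".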

Just to add a note: if you use aider --model ollama/llama3.1:8b, it will work.

cccarv82 avatar Jul 31 '24 19:07 cccarv82

Hi, same problem here.

ollama version is 0.3.2
Aider v0.47.1
aider --model ollama/llama3.1

Unexpected error: Attempted to access streaming response content, without having called `read()`.
Traceback (most recent call last):
  File "/home/loicngr/.local/lib/python3.10/site-packages/aider/coders/base_coder.py", line 865, in send_new_user_message
    yield from self.send(messages, functions=self.functions)
  File "/home/loicngr/.local/lib/python3.10/site-packages/aider/coders/base_coder.py", line 1126, in send
    yield from self.show_send_output_stream(completion)
  File "/home/loicngr/.local/lib/python3.10/site-packages/aider/coders/base_coder.py", line 1200, in show_send_output_stream
    for chunk in completion:
  File "/home/loicngr/.local/lib/python3.10/site-packages/litellm/llms/ollama.py", line 370, in ollama_completion_stream
    raise e
  File "/home/loicngr/.local/lib/python3.10/site-packages/litellm/llms/ollama.py", line 329, in ollama_completion_stream
    status_code=response.status_code, message=response.text
  File "/home/loicngr/.local/lib/python3.10/site-packages/httpx/_models.py", line 576, in text
    content = self.content
  File "/home/loicngr/.local/lib/python3.10/site-packages/httpx/_models.py", line 570, in content
    raise ResponseNotRead()
httpx.ResponseNotRead: Attempted to access streaming response content, without having called `read()`.

loicngr avatar Aug 02 '24 06:08 loicngr

Hey, when can we expect the fix? :)

ppulwey avatar Aug 02 '24 09:08 ppulwey

I just bumped the litellm version that aider uses, which should fix this.

The change is available in the main branch. You can get it by installing the latest version from github:

python -m pip install --upgrade git+https://github.com/paul-gauthier/aider.git

If you have a chance to try it, let me know if it works better for you.

paul-gauthier avatar Aug 02 '24 09:08 paul-gauthier

It's still the same for me on the latest version. aider output:

> hi

Tokens: 685 sent, 0 received.
21:10:24 - LiteLLM:WARNING: litellm_logging.py:1298 - Model=llama3.1:8b not found in completion cost map. Setting 'response_cost' to None

ollama terminal:

time=2024-08-02T21:10:24.244+03:00 level=INFO source=server.go:617 msg="llama runner started in 7.71 seconds"
[GIN] 2024/08/02 - 21:10:24 | 200 |    8.3906176s |       127.0.0.1 | POST     "/api/generate"
[GIN] 2024/08/02 - 21:10:43 | 200 |    483.3075ms |       127.0.0.1 | POST     "/api/generate"

ClaudiuHNS avatar Aug 02 '24 18:08 ClaudiuHNS

@ClaudiuHNS can you show me the announce lines that aider prints? Did you install the main branch from github?

I can no longer reproduce this "not found in completion cost map" warning using Aider v0.47.2-dev.

paul-gauthier avatar Aug 02 '24 18:08 paul-gauthier

Yes, I used Aider v0.47.2-dev. Found the issue: the problem I had only reproduces when using llama3.1:8b. With llama3.1 it works fine (which is still 8b, I guess). Like this:

Terminal 1:
ollama pull llama3.1:8b
ollama serve

Terminal 2:
aider --model ollama/llama3.1:8b

ClaudiuHNS avatar Aug 02 '24 18:08 ClaudiuHNS

In case it helps anyone here, I was having this problem and found out that I had a capitalized "B" in aider --model ollama/deepseek-coder-v2:16B. When I changed it to "deepseek-coder-v2:16b" it worked.

benbarnard-OMI avatar Aug 03 '24 00:08 benbarnard-OMI

I'm going to close this issue for now, but feel free to add a comment here and I will re-open or file a new issue any time.

paul-gauthier avatar Aug 06 '24 13:08 paul-gauthier