
TypeError: Cannot read properties of undefined (reading 'content')

EduardKononov opened this issue

Starting with 0.13.1, the following exception is thrown while having a conversation with deepseek-coder-v2:16b-lite-instruct-q6_K (possibly with other models too; I haven't tested):


Steps to reproduce on my machine:

  1. Send a big message (Code of razor component, 600+ lines)
  2. Get response from the model
  3. Send another message
  4. The error is thrown.

The odd thing is that the response is displayed for a while, but then it gets deleted and replaced with "Sorry". I end up with an error message followed by the response.

(screenshot of the error)

EduardKononov avatar Sep 23 '24 13:09 EduardKononov

This error message is indicative of the context window of the model being exceeded.

We made some changes to the way we interact with Ollama a few months ago (well before 0.13.1) that caused a similar error message to appear.

Unfortunately I'm traveling right now and it's impossible for me to download such a big model on hotel wifi, but I'll try to take a look at it in the next few weeks.

In the meantime, what version of Ollama are you using?

fmaclen avatar Sep 24 '24 09:09 fmaclen

Thank you for the response! At the time the error appeared, I was using 0.13.1. I then updated to the most recent version (0.14.x), but the error did not disappear.

EduardKononov avatar Sep 24 '24 20:09 EduardKononov

@EduardKononov sorry, I was asking about your version of Ollama. Right now, the latest is 0.3.11

fmaclen avatar Sep 25 '24 11:09 fmaclen

Oh, sorry, misread. 0.3.11

EduardKononov avatar Sep 25 '24 11:09 EduardKononov

@EduardKononov I tried this model a bunch, with long conversations and I'm unable to replicate this issue. Is it possible your Ollama server is running out of memory?

Can you provide the specs of the Ollama server environment? You can run `npx envinfo --system --npmPackages svelte,rollup,webpack --binaries --browsers`.

fmaclen avatar Oct 19 '24 01:10 fmaclen

I can replicate this on a system with 8 GB VRAM + 64 GB RAM, so Ollama is not running out of memory. Here is some more info: I'm using Ollama 0.3.12 with CUDA (ollama-cuda), and the model is partially on GPU, partially on CPU (since it does not fit within 8 GB of VRAM).

NOTE: I can also reproduce this error while running this model with `ollama run`, so I don't think the issue is related to Hollama.

When running with `ollama run`, the following error occurs after some time of chatting:

Error: an unknown error was encountered while running this model
NAME                    ID              SIZE    PROCESSOR       UNTIL              
deepseek-coder-v2:16b   63fb193b3a9b    10 GB   22%/78% CPU/GPU 2 minutes from now

For debugging purposes I have prepared the following conversation that should trigger it (each line is a new prompt in the same session):

Generate me a curl command that makes a POST request to https://moodle.example.com/login/index.php with form data "username" and "password" and saves the cookie output to a file
 
I need to use form data not query params
 
I also need a logintoken from moodle, how can I obtain it programatically with curl
 
What does -d parameter do?
 
Can you use -F instead of -d?
 
What do -oP flags in grep do?
 
What does \K mean in regular expression?
 
Write this using awk and regular expressions capture groups instead of \K
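
For reference, the token-extraction prompts at the end of that conversation boil down to something like the sketch below (assuming GNU grep for `-P`/`\K` and gawk for the three-argument `match()`; the HTML fragment is made up for illustration):

```shell
# Made-up sample of the hidden input Moodle embeds the login token in
html='<input type="hidden" name="logintoken" value="abc123XYZ">'

# grep -oP: -o prints only the matched part, -P enables Perl-compatible
# regexes; \K discards everything matched so far, so only the token prints
printf '%s\n' "$html" | grep -oP 'name="logintoken" value="\K[^"]+'
# -> abc123XYZ

# awk equivalent using a capture group instead of \K (the third, array
# argument to match() is a gawk extension)
printf '%s\n' "$html" | awk 'match($0, /name="logintoken" value="([^"]+)"/, m) { print m[1] }'
# -> abc123XYZ

# Aside on the curl flags discussed above: -d sends the request body as
# application/x-www-form-urlencoded, while -F builds multipart/form-data.
```
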

Kajot-dev avatar Nov 07 '24 22:11 Kajot-dev