phalexo comments

Results: 137 comments of phalexo

Out of memory error on model that previously worked fine after update to version 0.1.13

Did you mean to say 0.1.13 works? It is v0.1.11 that works for me. I copied the gguf folder from v0.1.11 to v0.1.12, recompiled, and that made v0.1.12 work. Between 11...

Out of memory error on model that previously worked fine after update to version 0.1.13

@madsamjp, tried it unsuccessfully with the next version up, v0.1.16, v0.1.15 cannot possibly work. On Fri, Dec 15, 2023 at 4:17 PM Igor Schlumberger ***@***.***> wrote: > @phalexo Ollama 0.1.15...

Out of memory error on model that previously worked fine after update to version 0.1.13

```bash
git clone --recursive https://github.com/jmorganca/ollama.git
cd ollama/llm/llama.cpp
vi generate_linux.go
```

```go
//go:generate cmake -S ggml -B ggml/build/cuda -DLLAMA_CUBLAS=on -DLLAMA_ACCELERATE=on -DLLAMA_K_QUANTS=on -DLLAMA_CUDA_FORCE_MMQ=on
//go:generate cmake --build ggml/build/cuda --target server --config Release
//go:generate...
```

Out of memory error on model that previously worked fine after update to version 0.1.13

How is the performance though? Is it impacted by the change? On Sat, Dec 16, 2023, 6:11 AM madsamjp ***@***.***> wrote: > @phalexo this works! It seems that adding >...

Out of memory error on model that previously worked fine after update to version 0.1.13

Did you ever test performance with the MMQ flag versus 0.1.11? On Wed, Jan 3, 2024, 1:11 PM madsamjp ***@***.***> wrote: > @technovangelist I've updated to the > latest...

Out of memory error on model that previously worked fine after update to version 0.1.13

All my testing is ad hoc, so it is difficult to assess. I thought you run a largish system, so it may be noticeable. I have a suspicion that there may be a...

###### problem

I see very similar behavior running on GPUs; VRAM usage is less than 50%. Besides the #### output, I see the page scrolling pretty fast. This is the response to the first query....

###### problem

Check the prompt format for this model. I think I've seen this when I failed to use the correct prompt format. On Mon, Jan 8, 2024, 9:01 AM simplesisu ***@***.***>...

Better reports "Out of memory"

> A lot of users don't understand they are facing a memory error. It would be nice to explain in the error message that it is a memory error. > >...

Better reports "Out of memory"

The most likely bug is that one of the specialized matrix/matrix multiply kernels is leaking memory. On Fri, Jan 5, 2024, 1:01 PM Igor Schlumberger ***@***.***> wrote: > @jukofyork I'm...