frob

Results: 840 comments by frob

You can set `num_predict` as a parameter in a copy of the model:

```
$ ollama show --modelfile codellama:7b-code | sed -e 's/^FROM.*/FROM codellama:7b-code/' > Modelfile
$ echo "PARAMETER num_predict...
```
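As a rough sketch of the complete flow (the value `128` and the name `codellama-short` are purely illustrative, not from the original comment):

```
# rewrite FROM to point at the published model, then append the token limit
$ ollama show --modelfile codellama:7b-code | sed -e 's/^FROM.*/FROM codellama:7b-code/' > Modelfile
$ echo "PARAMETER num_predict 128" >> Modelfile
# build and run the copy under a new name
$ ollama create codellama-short -f Modelfile
$ ollama run codellama-short
```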

Output of:

```
uname -a
command -v ollama
ls -l /usr/local/bin/ollama
file /usr/local/bin/ollama
ldd /usr/local/bin/ollama
```

FWIW, I was able to build ollama in a Void Linux container (CPU only).

```console
$ docker pull ghcr.io/void-linux/void-musl-full
$ docker run --rm -it --name void --entrypoint sh ghcr.io/void-linux/void-musl-full
# xbps-install...
```

I have no input on the rest of your document, but `/api/embeddings` [is deprecated](https://github.com/ollama/ollama/blob/main/docs/api.md#:~:text=Note%3A%20this%20endpoint%20has%20been%20superseded%20by%20/api/embed).
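For reference, a request against the newer `/api/embed` endpoint looks roughly like this (the model name and input text are just placeholders):

```
curl http://localhost:11434/api/embed -d '{
  "model": "all-minilm",
  "input": "Why is the sky blue?"
}'
```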

How are you querying the model? `ollama run` or some other client?

The ollama server doesn't store any state. If you are finding that the server is responding as if it does, then that would seem to be a problem with the...
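To illustrate what statelessness means in practice: with `/api/chat` the client resends the entire conversation on every request, so anything that looks like memory lives on the client side. A minimal sketch (the model name is an assumption):

```
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "My name is Ada." },
    { "role": "assistant", "content": "Nice to meet you, Ada." },
    { "role": "user", "content": "What is my name?" }
  ],
  "stream": false
}'
```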

```
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output options
```

We can't see the actual error message. But chances are...

48M is really small for a llama-based model. Where did you get the file from? It might be an adapter, in which case you need to include the base model...
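If it does turn out to be an adapter, a Modelfile along these lines would pair it with its base model (the base model name and file path here are assumptions, not details from the issue):

```
# base model the adapter was trained against (llama2 is only an example)
FROM llama2
# path to the small file, assuming it really is a LoRA adapter
ADAPTER ./adapter.gguf
```

Then `ollama create <name> -f Modelfile` builds a runnable model from the pair.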

```
time=2025-02-18T02:29:51.627Z level=INFO source=server.go:130 msg=offload library=cuda layers.requested=-1 layers.model=81 layers.offload=44 layers.split=11,11,11,11 memory.available="[22.0 GiB 22.0 GiB 22.0 GiB 22.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="127.4 GiB" memory.required.partial="87.1 GiB" memory.required.kv="19.5 GiB" memory.required.allocations="[21.8 GiB 21.8 GiB...
```