frob

Results: 840 comments by frob

You can set `num_predict` as a parameter in a copy of the model:

```
$ ollama show --modelfile codellama:7b-code | sed -e 's/^FROM.*/FROM codellama:7b-code/' > Modelfile
$ echo "PARAMETER num_predict...
```
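As a rough sketch of the complete flow (the value `128` and the name `codellama-short` are purely illustrative, not from the original comment):

```
# rewrite FROM to point at the published model, then append the token limit
$ ollama show --modelfile codellama:7b-code | sed -e 's/^FROM.*/FROM codellama:7b-code/' > Modelfile
$ echo "PARAMETER num_predict 128" >> Modelfile
# build and run the copy under a new name
$ ollama create codellama-short -f Modelfile
$ ollama run codellama-short
```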

Output of:

```
uname -a
command -v ollama
ls -l /usr/local/bin/ollama
file /usr/local/bin/ollama
ldd /usr/local/bin/ollama
```

FWIW, I was able to build ollama in a Void Linux container (CPU only).

```console
$ docker pull ghcr.io/void-linux/void-musl-full
$ docker run --rm -it --name void --entrypoint sh ghcr.io/void-linux/void-musl-full
# xbps-install...
```

I have no input on the rest of your document, but `/api/embeddings` [is deprecated](https://github.com/ollama/ollama/blob/main/docs/api.md#:~:text=Note%3A%20this%20endpoint%20has%20been%20superseded%20by%20/api/embed).
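For reference, a request against the newer `/api/embed` endpoint looks roughly like this (the model name and input text are just placeholders):

```
curl http://localhost:11434/api/embed -d '{
  "model": "all-minilm",
  "input": "Why is the sky blue?"
}'
```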

How are you querying the model? `ollama run` or some other client?

The ollama server doesn't store any state. If you are finding that the server is responding as if it does, then that would seem to be a problem with the...
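To illustrate what statelessness means in practice: with `/api/chat` the client resends the entire conversation on every request, so anything that looks like memory lives on the client side. A minimal sketch (the model name is an assumption):

```
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "My name is Ada." },
    { "role": "assistant", "content": "Nice to meet you, Ada." },
    { "role": "user", "content": "What is my name?" }
  ],
  "stream": false
}'
```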

```
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output options
```

We can't see the actual error message. But chances are...

48M is really small for a llama-based model. Where did you get the file from? It might be an adapter, in which case you need to include the base model...
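If it does turn out to be an adapter, a Modelfile along these lines would pair it with its base model (the base model name and file path here are assumptions, not details from the issue):

```
# base model the adapter was trained against (llama2 is only an example)
FROM llama2
# path to the small file, assuming it really is a LoRA adapter
ADAPTER ./adapter.gguf
```

Then `ollama create <name> -f Modelfile` builds a runnable model from the pair.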

```
time=2025-02-18T02:29:51.627Z level=INFO source=server.go:130 msg=offload library=cuda layers.requested=-1 layers.model=81 layers.offload=44 layers.split=11,11,11,11 memory.available="[22.0 GiB 22.0 GiB 22.0 GiB 22.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="127.4 GiB" memory.required.partial="87.1 GiB" memory.required.kv="19.5 GiB" memory.required.allocations="[21.8 GiB 21.8 GiB...
```