Josh Leverette
### What is the issue?
[This PR](https://github.com/ggerganov/llama.cpp/pull/6920), which just merged into llama.cpp, contains important improvements to how tokenization works for Llama 3 and other models. An example of the issue...
This is a little something I worked up (with some help :robot:) to make my life easier as a `fish` user, in `~/.config/fish/completions/ollama.fish`:

```fish
function __ollama_list
    set -l query (string join...
```
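For readers who want to adapt the idea, here is a minimal self-contained sketch of a fish completion along the same lines. It is hypothetical, not the snippet above: the helper name `__ollama_models` is invented, and it assumes `ollama list` prints a header row followed by one model per line with the model name in the first column.

```fish
# Hypothetical sketch: collect locally installed model names,
# assuming `ollama list` output is "NAME  ID  SIZE  MODIFIED" rows.
function __ollama_models
    # Skip the header row, then keep only the first column (the model name).
    ollama list 2>/dev/null | tail -n +2 | string replace -r '\s.*' ''
end

# Offer installed model names after subcommands that take a model argument,
# and disable file completion (-f) for those positions.
complete -c ollama -n '__fish_seen_subcommand_from run rm show cp' -f -a '(__ollama_models)'
```

Saved as `~/.config/fish/completions/ollama.fish`, fish picks this up automatically the next time completions for `ollama` are requested.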
### What is the issue?
I downloaded the wrong model, ran it, realized my mistake, then deleted it, and noticed it was still listed as being present in VRAM according...
According to [this Refact blog post](https://refact.ai/blog/2023/self-hosted-15b-code-model/):

> Check out the [docs on self-hosting](https://github.com/smallcloudai/refact-self-hosting) to get your AI code assistant up and running.
>
> To run StarCoder using 4-bit quantization, you’ll...
I have seen some other talk of memory leaks (#390), but I'm having a more sporadic, shorter-term issue. I've experienced this on both an RTX 4070 with 12GB VRAM...
Since Phi-3 mini did so well on the leaderboard, it would be interesting to see where the new small and medium models land. With Phi-3 vision, it also seems like...
### Before submitting your bug report
- [X] I believe this is a bug. I'll try to join the [Continue Discord](https://discord.gg/NWtdYexhMs) for questions
- [X] I'm not able to find...