Josh Leverette

132 comments by Josh Leverette

@Confuze setting the `num_gpu` parameter to 10000 forces _more_ layers onto the GPU, not fewer. Mixtral has 33 layers. You just have to keep lowering that number...
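As a rough sketch of what "lowering that number" looks like in practice, the offload count can be set per-model in an Ollama Modelfile via `PARAMETER num_gpu` (the model name and the value 28 below are illustrative; you would decrease the value until the model fits in VRAM):

```
# Hypothetical Modelfile: cap GPU offload below Mixtral's 33 layers
FROM mixtral

# num_gpu = number of layers offloaded to the GPU; lower it until loading succeeds
PARAMETER num_gpu 28
```

The same option can also be passed per-request as `"options": {"num_gpu": 28}` in the Ollama HTTP API.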

@madsamjp With a 4090, you should be able to offload all 33 layers of the 3-bit quantized models and get 50+ tokens per second. If you want to run the...

Part of the appeal of LWM is that it does support video, but I don’t think there’s any way to use it with videos in ollama currently.

I don’t have commit access to this repo, so I can’t reopen this issue, but it might be worth keeping it open for now.

I'm intermittently seeing the exact same extra-space issue in the few minutes I've spent testing the extension in PyCharm, and the extra space before the suggestion...

Here is an example of the kind of JSON output I've seen before this PR:

```json
{
  "bullet_points": [
    {
      "text": "Some text here, clipped for brevity of the example"
      ...
```

Given that we're in the `impl Period`, is there any momentum toward making forward progress on this? In addition to the heavily discussed stuff, this issue seems to be a...

This was probably the main issue for this kind of thing: https://github.com/ollama/ollama/issues/1952#issuecomment-2105376333 I would probably leave a comment there too. Since you're on AMD, it's not actually related to CUDA,...

https://github.com/ollama/ollama/issues/4245 https://github.com/ollama/ollama/issues/4221

https://github.com/ggerganov/llama.cpp/pull/7225 just merged, so I wonder if it's time to get the 128k models added to the library as well.