S. Dale Morrey comments

Results: 36 comments of S. Dale Morrey


Feature Request: Add support for Raspberry Pi AI Kit

> So what you're telling me is that it's got a 32-bit ARM CPU on it with 2 cores. I doubt there's much advantage offloading to that. Plus having to...

Bug: src/llama.cpp:15099: Deepseek2 does not support K-shift

I'm confused

```
ollama run --verbose deepseek-coder-v2:16b-lite-instruct-q8_0
>>> /show info
  Model
    arch                deepseek2
    parameters          15.7B
    quantization        Q8_0
    context length      163840
    embedding length    2048
```

It looks like the training context...

Bug: src/llama.cpp:15099: Deepseek2 does not support K-shift

Ok so I figured this out on my own with a little help from deepseek-coder-v2:16b-lite-instruct-q8_0. The context length reported is the maximum length the model can support even in theory....

Bug: src/llama.cpp:15099: Deepseek2 does not support K-shift

I do this with a parameter over the OpenAI API; just follow the OpenAI docs for the REST API. It works.

support for qwen3-embedding and qwen3-reranker models

> [@rick-github](https://github.com/rick-github) thanks it worked
>
> Also, one more thing, do you know how I set the `dimensions` `Qwen3-Embedding-8B-GGUF` generates 4096 tokens currently i want only `1024`

It's MRL...

support for qwen3-embedding and qwen3-reranker models

> > It's MRL like nomic-embed-text; this means you just truncate to the size you want, since the most important dimensions come first. I'm getting better results on Qwen3-0.6b embed...
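
For anyone wanting to try the truncation described above, here is a minimal sketch of the idea, not anything taken from Ollama or the Qwen3 tooling. It assumes the embedding arrives as a plain list of floats; `truncate_embedding` is a hypothetical helper name. Re-normalizing after the cut is also an assumption on my part: it keeps cosine similarity equal to a simple dot product, which is how MRL-style embeddings are commonly used.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of an MRL-style embedding
    and rescale the result to unit length (hypothetical helper)."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return list(head)
    return [x / norm for x in head]


def cosine(a, b):
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy 4-dimensional vectors standing in for real 4096-dimensional output.
    doc = truncate_embedding([0.5, 0.5, 0.1, -0.2], 2)
    query = truncate_embedding([0.4, 0.6, -0.3, 0.1], 2)
    print(len(doc), cosine(doc, query))
```

With a real model you would pass the full vector returned by the embedding endpoint and set `dim` to the size you want, for example 1024.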