Yuming Pan

Results 3 issues of Yuming Pan

I have an AMD EPYC 9654 with 96 cores and 192 threads. When running llama.cpp /main with Yi-34b-chat Q4, inference speed peaks at around 60 threads. Setting more...

bug-unconfirmed

### What is the issue? Has anyone recently deployed ollama on Ubuntu? I've noticed that no matter which model I use, including qwen, deepseek, and phi4 (fp16 full model), if...

bug