Fomenko

Results: 5 issues by Fomenko

### Following problem: When I run "ollama run Mistral", the GPU constantly runs at 100% and consumes 100 watts, but the chat works fine without any problems. **The...

bug-unconfirmed

### What is the issue? # Bug Report ## Description If I use any LLava models ([LLava-Phi-3](https://ollama.com/library/llava-phi3)), the custom system prompt works fine. But if I upload a picture...

bug

# Bug Report ## Description If I use any LLava models ([LLava-Phi-3](https://ollama.com/library/llava-phi3)), the custom system prompt works fine. But if I upload a picture at the start of the...

Is it somehow possible to run it in parallel like Ollama, e.g. with OLLAMA_NUM_PARALLEL=8 and OLLAMA_MAX_QUEUE=2048, so that multiple chats can be executed at the same time or held in a queue?
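For reference, Ollama itself reads these two environment variables when the server starts, so concurrency is configured on `ollama serve`, not per chat. A minimal sketch (the specific values 8 and 2048 are just the ones from the question, not recommendations):

```shell
# OLLAMA_NUM_PARALLEL: how many requests each loaded model may serve concurrently.
# OLLAMA_MAX_QUEUE: how many additional requests may wait before the server
# rejects new ones. Both must be set in the environment of the server process.
export OLLAMA_NUM_PARALLEL=8
export OLLAMA_MAX_QUEUE=2048
ollama serve
```

Note that raising OLLAMA_NUM_PARALLEL multiplies the context-memory footprint of each loaded model, so VRAM is usually the practical limit.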

Hello, I have the following problem: I'm encountering a performance bottleneck in my worker cluster. I currently have 4 workers distributed across multiple machines, but one worker is significantly slower than...