mistral.rs
Streamed inference is not as smooth (or as fast?) as with e.g. Ollama (Llama 3.1)
Describe the bug
Have a look at the attachment :-)
https://github.com/user-attachments/assets/321dbb21-2403-4330-9ce1-091902298888
Latest commit or version
0.22, on an MBP M3 Max