mistral.rs

Published 20 hours ago


Streamed inference not as smooth (fast?) as with e.g. Ollama - Llama 3.1

Open ChristianWeyer opened this issue 7 months ago • 36 comments

Describe the bug

Have a look :-)

https://github.com/user-attachments/assets/321dbb21-2403-4330-9ce1-091902298888

Latest commit or version

0.22 MBP M3 Max

Jul 25 '24 16:07 ChristianWeyer