
Blazingly fast LLM inference.

Results: 186 mistral.rs issues

## Describe the bug Running Llama 3.2 on my MacBook Pro M3 Max (128GB) - with ``` cargo run --release --features metal -- --port 1234 vision-plain -m lamm-mit/Cephalo-Llama-3.2-11B-Vision-Instruct-128k -a...

bug
triaged

## Describe the bug I got an error in directory: `/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs/examples` My machine environment: ``` ProductName: macOS ProductVersion: 14.4.1 Hardware Overview: Model Name: MacBook Pro Model Identifier: MacBookPro18,4 Model Number: Z15H0016ZJ/A Chip: Apple...

bug

https://youtu.be/gtcOncFLMeo

new feature

This occurs when using two GPUs, but it does not occur when I use just one. I made sure to update to the Docker image used in the Dockerfile....

How to deploy mistralrs on Android for large model inference?

new feature

There are no pre-built `mistralrs-server` binaries under assets in the [0.3.0 GitHub release](https://github.com/EricLBuehler/mistral.rs/releases/tag/v0.3.0), very sad 😢

bug

I use stop words/sequences to determine the next step after a response. So if the LLM returns stop word A, we perform action X. I also do the same with other...
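The pattern described above can be sketched as a simple lookup table mapping stop sequences to follow-up actions. This is an illustrative standalone sketch, not mistral.rs API: the `Action` enum, the stop-word strings, and `action_for` are all hypothetical names; only the dispatch idea comes from the issue text.

```rust
use std::collections::HashMap;

// Hypothetical follow-up actions to run after a response ends
// on a given stop sequence.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    RunSearch,
    AskUser,
    Finish,
}

// Look up the action registered for the stop sequence that fired,
// if any.
fn action_for(stop_word: &str, table: &HashMap<&str, Action>) -> Option<Action> {
    table.get(stop_word).copied()
}

fn main() {
    let mut table = HashMap::new();
    table.insert("<search>", Action::RunSearch);
    table.insert("<ask>", Action::AskUser);
    table.insert("</done>", Action::Finish);

    // In practice the completion response would report which stop
    // sequence terminated generation; here we simulate it.
    let fired = "<search>";
    assert_eq!(action_for(fired, &table), Some(Action::RunSearch));
    assert_eq!(action_for("not-a-stop-word", &table), None);
    println!("dispatching {:?}", action_for(fired, &table));
}
```

The key requirement from the issue is that the server must tell the client *which* stop sequence ended the response, so the table lookup can be made on the client side.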

new feature

Hello, llama.cpp recently added support for an AArch64-specific type of GGUF and AArch64-specific matmul kernels. Here is the merged PR: https://github.com/ggerganov/llama.cpp/pull/5780#pullrequestreview-21657544660 Namely Q4_0_8_8, Q4_0_4_8, and the more generic Q4_0_4_4...

new feature