mistral.rs

Blazingly fast LLM inference.

Results 186 mistral.rs issues

**Describe the bug** Even with mistralrs-metal (or mistralrs-accelerate) installed, the model runs on the CPU. The log output during the run indicates this, and the run takes exactly the same amount of time as if...

bug

Hello! I had a thought. To minimize constant load for tasks that occur infrequently, is there a way to keep the Docker container running with the HTTP server, but only...

new feature
backend

It would be fantastic if mistral.rs implemented an exllamav2 backend to allow loading exl2 models. I know you're planning this, but I saw there wasn't an open feature request to...

new feature
backend
models

https://huggingface.co/openvla/openvla-7b

Currently, AnyMoE only supports homogeneous expert types. This restricts the user to using only fine-tuned or only LoRA adapter experts. Implementing heterogeneous expert support will enable, for example, mixing fine-tuned...

new feature
models

Currently, the tight loop in `Engine` causes very high single-core CPU usage when idle. It is also problematic because it is long-running blocking code running inside of an...
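The idle-CPU problem described above is typical of a worker that polls for work in a loop. As a rough illustration only, here is a minimal sketch of the usual alternative: blocking on a channel so the thread sleeps until a request arrives. The `Request` type and `run_engine` function are hypothetical names for this example and are not taken from mistral.rs; the sketch uses the standard library's `std::sync::mpsc` channel rather than whatever the project's `Engine` actually uses.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical request type for illustration; not mistral.rs's actual type.
#[derive(Debug)]
enum Request {
    Generate(String),
    Shutdown,
}

/// Worker loop that blocks in `recv()` while idle instead of polling.
/// Returns the prompts it handled, in order.
fn run_engine(rx: Receiver<Request>) -> Vec<String> {
    let mut handled = Vec::new();
    // `recv()` parks the thread until a message arrives, so an idle
    // worker does not spin. It returns `Err` once all senders are dropped.
    while let Ok(req) = rx.recv() {
        match req {
            Request::Generate(prompt) => handled.push(prompt),
            Request::Shutdown => break,
        }
    }
    handled
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // Run the blocking loop on a dedicated OS thread.
    let worker = thread::spawn(move || run_engine(rx));

    tx.send(Request::Generate("hello".to_string())).unwrap();
    tx.send(Request::Shutdown).unwrap();

    let handled = worker.join().unwrap();
    println!("handled {} request(s)", handled.len());
}
```

Running the loop on its own thread also keeps the blocking call out of any surrounding executor, which is one common way to address the second concern in the issue. Whether that fits the project's design is not something this excerpt settles.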

## Minimum reproducible example
`cargo build --release --features metal`
## Error
Compiling candle-core v0.7.2 (https://github.com/EricLBuehler/candle.git?rev=60eb251#60eb251f)
error[E0004]: non-exhaustive patterns: `DType::F8E4M3` not covered
--> /Users/viacheslav.maslov/.cargo/git/checkouts/candle-c6a149c3b35a488f/60eb251/candle-core/src/metal_backend/mod.rs:96:15
|
96 | match self.dtype {
|...

bug
build
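For readers unfamiliar with `error[E0004]`, the build failure above is the compiler rejecting a `match` that does not cover every enum variant. The sketch below shows the general pattern with a hypothetical stand-in enum. It is not candle's actual `DType` definition or its Metal backend code, and the excerpt does not show how the real code was fixed.

```rust
/// Hypothetical stand-in for a dtype enum; not candle's actual `DType`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DType {
    F16,
    F32,
    F8E4M3,
}

/// Size of one element in bytes.
fn size_in_bytes(dtype: DType) -> usize {
    match dtype {
        DType::F16 => 2,
        DType::F32 => 4,
        // Deleting this arm makes the build fail with
        // error[E0004]: non-exhaustive patterns: `DType::F8E4M3` not covered
        DType::F8E4M3 => 1,
    }
}

fn main() {
    for dtype in [DType::F16, DType::F32, DType::F8E4M3] {
        println!("{:?}: {} byte(s)", dtype, size_in_bytes(dtype));
    }
}
```

An error like this usually appears when a variant is added to an enum and a backend-specific `match` elsewhere has not been updated to handle it.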