mistral.rs
Blazingly fast LLM inference.
When attempting to run Gemma2 models in GGUF format using the example provided [here](https://github.com/EricLBuehler/mistral.rs/blob/master/mistralrs/examples/gguf_locally/main.rs), I encounter the following error: ``` Error: Unknown GGUF architecture `gemma2` ``` Is there a plan...
## Describe the bug When running this command: `RUST_BACKTRACE=full CUDA_LAUNCH_BLOCKING=1 target/release/mistralrs-server -i --isq Q4K -n "1:16;2:16;3:10" --no-paged-attn plain -m google/gemma-2-9b-it -a gemma2`, I'm getting this error: 2024-08-27T05:53:27.880023Z INFO mistralrs_server: avx:...
I am trying to run Phi-3.5 vision with multiple images and multiple prompts, but without success. Can anyone help with the format? I get the error below when I pass multiple images: `ValueError:...
## Describe the bug Prompting in interactive mode and specifying different images reuses the first image, e.g.: ``` > \image https://niche-museums.imgix.net/pioneer-history.jpeg?w=1600&h=800&fit=crop&auto=compress describe this image including any text The image shows...
## Describe the bug When I run this command: ```bash cargo run --bin mistralrs-server --release --features "cuda" -- -i gguf -m /external/bradley/llama.cpp/models -f llama-31-70B-Q4-K-M.gguf ``` I get the following error:...
## Minimum reproducible example `cargo build --release --features cuda` ## Error error: failed to run custom build command for `mistralrs-quant v0.3.1 (C:\Users\misur\Desktop\rustsrc\mistral.rs.0.3.1.0862\mistralrs-quant)` Caused by: process didn't exit successfully: `C:\Users\misur\Desktop\rustsrc\mistral.rs.0.3.1.0862\target\release\build\mistralrs-quant-a5b0a5658b3f8319\build-script-build` (exit...
The PyPI builds of `mistralrs` - https://pypi.org/project/mistralrs/ and https://pypi.org/project/mistralrs-metal/ and suchlike - currently only ship a `.tar.gz` file. This means the user must have a Rust toolchain installed in order...
## Describe the bug When initializing and dropping the Model repeatedly: 1. Memory usage continuously increases as GGUF models aren't properly cleaned up 2. Channel is erroneously closed after the...