mistral.rs issues

Improve getting started docs

13

**Describe the bug** Trying to follow the getting started commands on Mac, but they don't seem to work out of the box: ```sh % cp ./target/release/mistralrs-server . cp: cannot overwrite directory...

igo

documentation

resolved

"Error: No such file or directory (os error 2)" when running the server

**Describe the bug** When running the server after a fresh clone and build **Latest commit** commit 092deeec5ed9c45b36df280d6eba2b0632d4f415 **How to reproduce** ```shell git clone https://github.com/EricLBuehler/mistral.rs.git cd mistral.rs cargo run -- -i...

ditod

bug

Better error when token not provided for gated models

We should distinguish between 2 cases in `api_get_file!`:

- 404: read from local
- Anything else: propagate error

Currently, if the "error" is not 404, we will still attempt reading...
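The two cases described above could be sketched roughly as follows. This is a minimal illustration, not the actual `api_get_file!` macro: `RemoteError`, `Resolved`, and `resolve_file` are hypothetical names introduced here, and the remote request is modeled as a plain `Result` rather than a real Hugging Face Hub call.

```rust
/// Hypothetical error for a failed remote request (an assumption, not the
/// real error type used by mistral.rs).
#[derive(Debug, PartialEq)]
enum RemoteError {
    /// HTTP status code returned by the server, e.g. 404 or 401.
    Status(u16),
}

/// Where the file was ultimately resolved from.
#[derive(Debug, PartialEq)]
enum Resolved {
    Remote(String),
    Local(String),
}

/// Resolve a file given the outcome of a remote request.
/// - success: use the remote file
/// - 404: fall back to reading from the local path
/// - anything else (e.g. 401 for a gated model without a token): propagate
fn resolve_file(
    remote: Result<String, RemoteError>,
    local_path: &str,
) -> Result<Resolved, RemoteError> {
    match remote {
        Ok(path) => Ok(Resolved::Remote(path)),
        Err(RemoteError::Status(404)) => Ok(Resolved::Local(local_path.to_string())),
        Err(other) => Err(other),
    }
}
```

With this shape, a missing token for a gated model would surface as the original error instead of a later, less informative failure while reading a local file that does not exist.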

EricLBuehler

Async sampling

14

```
./target/profiling/mistralrs-bench -p 0 -g 64 -r 1 -c 8 gguf -t mistralai/Mistral-7B-Instruct-v0.1 -m TheBloke/Mistral-7B-Instruct-v0.1-GGUF -f mistral-7b-instruct-v0.1.Q4_K_M.gguf
```

Master

![image](https://github.com/EricLBuehler/mistral.rs/assets/12750442/dc6468fe-40f3-498d-ad53-feda993d4761)

This PR

![image](https://github.com/EricLBuehler/mistral.rs/assets/12750442/d72ad4f5-7dca-41d9-8789-58fffed9efd5)

lucasavila00

Add topk scalings, topk softmax scalings for X-LoRA

1

This is currently pending on some way to do topk in Candle.

EricLBuehler

new feature

models

Intermediate loading of ISQ models on CPU

This will allow loading very large models onto the CPU and then applying ISQ onto the device.

EricLBuehler

new feature

backend

models

Batched Prefill

4

lucasavila00

new feature

optimization

models

Batched & chunked prefill

2

Similar to what was described here https://github.com/huggingface/candle/issues/2108 "When prompts get longer than trivial sizes, the memory usage spikes as the prompt is thrown into one Tensor and sent off to...
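As a rough illustration of the chunking idea, the sketch below splits a prompt's token ids into fixed-size chunks and feeds them one at a time instead of as a single tensor. It is a minimal sketch under stated assumptions: `chunk_prompt`, `chunked_prefill`, and the `forward_chunk` callback are hypothetical names, and the callback only stands in for a model forward pass that would extend the KV cache.

```rust
/// Split a prompt's token ids into chunks of at most `chunk_size` tokens.
fn chunk_prompt(tokens: &[u32], chunk_size: usize) -> Vec<&[u32]> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    tokens.chunks(chunk_size).collect()
}

/// Run prefill chunk by chunk. `forward_chunk` is a hypothetical stand-in
/// for the model forward pass; it receives the chunk and the position
/// offset of its first token within the full prompt.
fn chunked_prefill<F: FnMut(&[u32], usize)>(
    tokens: &[u32],
    chunk_size: usize,
    mut forward_chunk: F,
) {
    let mut offset = 0;
    for chunk in chunk_prompt(tokens, chunk_size) {
        forward_chunk(chunk, offset);
        offset += chunk.len();
    }
}
```

Each call then handles at most `chunk_size` tokens, so the input passed to a single forward call is bounded by the chunk size rather than by the prompt length.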

lucasavila00

new feature

models

Model Wishlist

88

Please let us know what model architectures you would like to be added! **Up to date todo list below. Please feel free to contribute any model, a PR without device...

EricLBuehler

models
