mistral.rs issues

Compilation fails on macOS due to .zip(devices)

2

## Describe the bug Log output for building with --features metal: `error[E0425]: cannot find value `devices` in this scope --> mistralrs-core/src/pipeline/isq.rs:194:26 | 194 | .zip(devices) | ^^^^^^^ help: a local...

vlbosch

bug

Implement DRY penalty

7

@p-e-w, could you please give the implementation a quick check? I'm not sure if you are familiar with Rust, but I ported the algorithm from the oobabooga implementation you linked....

EricLBuehler

new feature

Streamed inference not as smooth (fast?) as with e.g. Ollama - Llama 3.1

36

## Describe the bug Have a look :-) https://github.com/user-attachments/assets/321dbb21-2403-4330-9ce1-091902298888 ## Latest commit or version 0.22 MBP M3 Max

ChristianWeyer

bug

Remove plotly and just output CSV loss file

1

EricLBuehler

breaking

Refactor request messages (Rust) API

Currently, our messages API is clunky as we need to support the older OpenAI format as well as the new, multimodal format (for Idefics and Llava). This is exposed in...

EricLBuehler

good first issue

How does the M1 performance compare with llama.cpp or ollama?

2

How does the M1 performance compare with llama.cpp or ollama?

luohao123

Enable multiple CPU from arguments

6

I have a 32-core AMD CPU and no GPU. mistral.rs will only use two of the cores, which is too few. Is it possible to allow to...

lij55

new feature

Any plan about KV compression algorithm like SnapKV and PyramidKV?

4

Hi, I'm wondering if you have any plans regarding KV compression methods like SnapKV and PyramidKV. These methods can reduce the memory used by the KV cache, hence improving availability...

chenwanqq

new feature

optimization

backend

Initial KV RingAttention code

6

This is the start of the RingAttention code. The changes so far have been to create multiple KV caches (if multiple num_devices) and to try to create separate chunks.

joshpopelka20

[feat] running the server from rust

3

I'm looking for a production-ready LLM API that I can use from Rust as a library in https://github.com/louis030195/screen-pipe. Would it be possible to provide some abstraction like ```rs let...

louis030195

new feature
