mistral.rs

Blazingly fast LLM inference.

Results 186 mistral.rs issues

This PR implements our first embedding model: nomic-ai/nomic-embed-text-v1!

new feature
models

This adds flake support for [Nix](https://nixos.org/)

documentation
new feature

- [ ] Support for LongRope (this is supported with ISQ in non-GGUF models, though) - The challenge is that the scalings information is not present in the GGUF file....

models

**Describe the bug**
I am not sure if that's a bug. Python 3.10, M1.
```python
from mistralrs import Runner, Which, ChatCompletionRequest, Architecture

runner = Runner(
    Which.Plain(
        model_id="google/gemma-2-9b-it",
        repeat_last_n=64,
        tokenizer_json=None,
        arch=Architecture.Gemma,
    )...
```

bug
resolved
triaged

- [x] Loader and model
- [ ] ISQ
- [ ] AnyMoE
- [ ] Device Mapping
- [ ] X-LoRA/LoRA
- [ ] Adapter activation

[Dolphin Vision 72B](https://huggingface.co/cognitivecomputations/dolphin-vision-72b) is a fine-tune of the base model [Qwen/Qwen2-72B](https://huggingface.co/Qwen/Qwen2-72B) that adds vision. This example uses transformers:
```python
import torch
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
from...
```

models

**Describe the bug**
High CPU use, no GPU use.
- macOS 14.4.1
- MacBook Pro M1 Max, 64 GB

`cargo build --example phi3v --release --features metal`

It takes minutes to execute...

bug
triaged

This is a tracking issue for the development of AnyMoE, which will be broken up into several PRs.
- [x] Core functionality, plain models, all APIs: #476
- [x] Support...

This PR adds GPTQ quantization ([paper here](https://arxiv.org/abs/2210.17323)) support. Refs: #418, #448.

new feature

## Introduction
This implementation is based on my work for [candle](https://github.com/huggingface/candle). However, it incorporates some notable differences:
* I have completely removed support for the model format used in the...

new feature
models