mixtral-inference

inference code for mixtral-8x7b-32kseqlen

Inference code for Mistral's Mixtral 8x7B mixture-of-experts model. Largely based on the Mistral 7B inference repository. Requires roughly 100 GB of VRAM in total.
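As a rough sanity check on the VRAM figure, the weight memory can be estimated from the parameter count. The sketch below assumes about 46.7 billion total parameters (the commonly published figure for Mixtral 8x7B) stored as 16-bit values; both numbers are assumptions, not taken from this repository, and the estimate ignores activations and the KV cache.

```python
# Rough weight-memory estimate; the figures below are assumptions, not measurements.
TOTAL_PARAMS = 46.7e9   # approximate published total parameter count for Mixtral 8x7B
BYTES_PER_PARAM = 2     # assuming fp16/bf16 weights


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Return weight storage in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


def per_device_gb(total_gb: float, num_devices: int) -> float:
    """Return the even share of weight memory per device."""
    return total_gb / num_devices


weights_gb = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM)
print(f"weights: ~{weights_gb:.1f} GB")
print(f"per device across 8 GPUs: ~{per_device_gb(weights_gb, 8):.1f} GB")
```

Under these assumptions the weights alone come to a little over 90 GB, which is consistent with the stated requirement once runtime overhead is added.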

Dependencies

PyTorch, SentencePiece, and xformers.

pip install -r requirements.txt

Usage

Assumes 8 CUDA devices. To run on a different number, change the device count near the bottom of main.py.

python main.py
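Before launching, it can help to confirm that enough CUDA devices are visible. The following is a minimal sketch of such a check; `check_device_count` and `visible_cuda_devices` are hypothetical helpers written for illustration and are not part of main.py. The only library call used is PyTorch's `torch.cuda.device_count()`.

```python
REQUIRED_DEVICES = 8  # the count this README says main.py assumes


def check_device_count(available: int, required: int = REQUIRED_DEVICES) -> None:
    """Raise a clear error if fewer CUDA devices are visible than required."""
    if available < required:
        raise RuntimeError(
            f"found {available} CUDA device(s), need {required}; "
            "edit the device count near the bottom of main.py"
        )


def visible_cuda_devices() -> int:
    """Return the number of CUDA devices PyTorch can see (0 if PyTorch is missing)."""
    try:
        import torch  # imported lazily so the check above has no hard dependency
    except ImportError:
        return 0
    return torch.cuda.device_count()
```

Calling `check_device_count(visible_cuda_devices())` before `python main.py` would fail early with a readable message instead of an error partway through model loading.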

Owner

vikhyat
