Results: 1 comment by Andy Bruce

candle's quantized example handles Llama, Mistral, and Zephyr, with GPU acceleration, in only ~450 lines: https://github.com/huggingface/candle/blob/main/candle-examples/examples/quantized/main.rs If Mistral support is added using candle, supporting Llama and Zephyr as well could be fairly trivial.