rank_llm

Adding Llama.cpp support with quantized models

Open ArthurCamara opened this issue 1 year ago • 0 comments

Add support for running the quantized models through Llama.cpp.

8-bit model: https://huggingface.co/castorini/rank_vicuna_7b_v1_q8_0/

4-bit model: https://huggingface.co/castorini/rank_vicuna_7b_v1_q4_0/
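For illustration, here is a rough sketch of what loading one of these quantized models could look like from Python. It is not rank_llm's implementation. It assumes the llama-cpp-python bindings (`llama_cpp.Llama`) are installed and that a weights file from one of the two repos above has already been downloaded by hand, in a format the installed llama.cpp build can read. The helper names (`repo_for_bits`, `load_quantized_model`, `generate`), the `model_path` argument, and the default context size are hypothetical.

```python
from pathlib import Path

# Repositories named in this issue, keyed by quantization bit width.
QUANTIZED_REPOS = {
    8: "castorini/rank_vicuna_7b_v1_q8_0",
    4: "castorini/rank_vicuna_7b_v1_q4_0",
}


def repo_for_bits(bits: int) -> str:
    """Return the Hugging Face repo id for the requested quantization."""
    try:
        return QUANTIZED_REPOS[bits]
    except KeyError:
        raise ValueError(f"no quantized model listed for {bits}-bit") from None


def load_quantized_model(model_path: str, n_ctx: int = 4096):
    """Load a locally stored quantized weights file with llama-cpp-python.

    Assumption: model_path is a file fetched manually from one of the
    repos above; the file name inside the repo is not known here.
    """
    if not Path(model_path).is_file():
        raise FileNotFoundError(model_path)
    # Imported lazily so the other helpers work without llama-cpp-python.
    from llama_cpp import Llama

    return Llama(model_path=model_path, n_ctx=n_ctx)


def generate(llm, prompt: str, max_tokens: int = 128) -> str:
    """Run a greedy completion and return only the generated text."""
    out = llm(prompt, max_tokens=max_tokens, temperature=0.0)
    return out["choices"][0]["text"]
```

The 4-bit file should need less memory than the 8-bit one, usually at some cost in output quality, so a caller might pick the bit width with `repo_for_bits` based on the hardware available.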

ArthurCamara • Sep 29 '23 19:09