rank_llm
Adding Llama.cpp support with quantized models
Adds llama.cpp support for running quantized models.
8-bit model: https://huggingface.co/castorini/rank_vicuna_7b_v1_q8_0/
4-bit model: https://huggingface.co/castorini/rank_vicuna_7b_v1_q4_0/
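
For anyone who wants to try the quantized checkpoints outside this PR, here is a minimal sketch of loading one through the `llama-cpp-python` bindings. It is an illustration, not the code this PR adds: the helper names `repo_for_bits` and `load_quantized` are hypothetical, the `n_ctx` default is an assumed value, and the model file name inside each repository is not stated here, so it has to be looked up on the model page and passed in. It also assumes `huggingface_hub` and `llama-cpp-python` are installed.

```python
# Hugging Face repositories linked above, keyed by quantization bit width.
QUANTIZED_REPOS = {
    8: "castorini/rank_vicuna_7b_v1_q8_0",
    4: "castorini/rank_vicuna_7b_v1_q4_0",
}


def repo_for_bits(bits: int) -> str:
    """Return the repository id for a bit width (hypothetical helper)."""
    try:
        return QUANTIZED_REPOS[bits]
    except KeyError:
        raise ValueError(f"no quantized model listed for {bits}-bit") from None


def load_quantized(bits: int, model_file: str, n_ctx: int = 4096):
    """Download one model file and open it with llama-cpp-python.

    model_file is the file name inside the repository; it is not given in
    this description, so check the model page. n_ctx=4096 is an assumption.
    """
    # Imported lazily so the lookup helper works without these packages.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    local_path = hf_hub_download(repo_id=repo_for_bits(bits), filename=model_file)
    return Llama(model_path=local_path, n_ctx=n_ctx)
```

Calling `load_quantized(4, model_file)` with the file name from the 4-bit repository would return a `Llama` object; the 8-bit variant trades a larger download and memory footprint for less quantization loss.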