ValueError: Unknown quantization method: bitsandbytes. Must be one of ['awq', 'gptq', 'squeezellm', 'marlin'].
When I try to run inference on the fine-tuned model with vLLM, I get this error. I have already saved the Unsloth fine-tuned model to the HF Hub. vLLM==0.4.0+cu118, unsloth==2024.5, transformers==4.40.2
Oh, you cannot use 4-bit (bitsandbytes) models with vLLM; you must merge and save the model to 16-bit with model.save_pretrained_merged, then load it in vLLM.
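A minimal sketch of that workflow, assuming the LoRA adapters were already saved locally; the directory names ("lora_model", "merged_16bit_model") are illustrative:

```python
from unsloth import FastLanguageModel

# Reload the fine-tuned model (path is an assumption).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lora_model",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Merge the LoRA weights into the base model and save as 16-bit weights,
# which vLLM can load directly (no bitsandbytes quantization involved).
model.save_pretrained_merged(
    "merged_16bit_model",
    tokenizer,
    save_method="merged_16bit",
)

# Then point vLLM at the merged directory, e.g.:
# from vllm import LLM
# llm = LLM(model="merged_16bit_model")
```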