Adding support for 8-bit quantization
Adding support for 8-bit quantization would be a good idea: it would fill the gap for people with less GPU VRAM to work with. Thank you.
Great idea! As usual, if this gets more upvotes, it'll signal I definitely have to add it to my roadmap :)) Since we're still just 2 brothers, I'll see what I can do when I have bandwidth :)
No worries mate.
@danielhanchen When I set load_in_8bit=True in my code:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,  # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    load_in_8bit = True,
    load_in_4bit = False,
)
I encountered the following error:
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != signed char
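The "c10::Half != signed char" pairing suggests fp16 activations are being multiplied against raw int8 weight storage without dequantization. A minimal sketch (my own reproduction attempt, not unsloth internals) that triggers the same class of error:

import torch
import torch.nn.functional as F

# fp16 activations (c10::Half) against int8 weights (signed char).
x = torch.randn(1, 8, dtype=torch.half)
w = torch.randint(-128, 127, (8, 8), dtype=torch.int8)

# Raises a dtype-mismatch RuntimeError (exact wording varies by device).
F.linear(x, w)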
If this is because 8-bit is not supported, I would kindly request you to consider adding 8-bit quantization.
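In the meantime, loading the model in 8-bit through plain transformers + bitsandbytes may serve as a stopgap. A sketch (assumes bitsandbytes is installed and that you can live without unsloth's fused kernels):

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 8-bit loading via bitsandbytes, bypassing unsloth entirely.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "teknium/OpenHermes-2.5-Mistral-7B",  # example model from the snippet above
    quantization_config = bnb_config,
    device_map = "auto",
)
tokenizer = AutoTokenizer.from_pretrained("teknium/OpenHermes-2.5-Mistral-7B")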