
Mistral-v0.1 nf4 is not quantized into 4-bit

Open WoosungMyung opened this issue 1 year ago • 1 comments

System Info

BitsAndBytesConfig with Mistral-7B-v0.1 does not appear to quantize the model into 4-bit, even though I used load_in_4bit=True.

Reproduction

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map={"": 0},
    torch_dtype=torch.bfloat16,
)

module = list(model.modules())
module[20].weight.data

Expected behavior

module[20].weight.data should be a 4-bit dtype (uint4), not uint8.

WoosungMyung avatar May 09 '24 12:05 WoosungMyung

Hi @LameloBally,

The data is packed into uint8 for storage, but each byte actually holds two 4-bit values, so seeing a uint8 tensor is expected.
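As a minimal sketch of the packing scheme (an illustration of the general idea, not the actual bitsandbytes implementation), two 4-bit quantization indices can be stored in one uint8 byte using shifts and masks; this is why the stored tensor has dtype uint8 and half as many elements as the original weight:

```python
def pack_4bit_pair(high: int, low: int) -> int:
    """Pack two 4-bit values (0..15) into a single uint8 byte."""
    assert 0 <= high < 16 and 0 <= low < 16
    return (high << 4) | low

def unpack_4bit_pair(byte: int) -> tuple[int, int]:
    """Recover the two 4-bit values from one packed byte."""
    return (byte >> 4) & 0xF, byte & 0xF

packed = pack_4bit_pair(13, 5)
print(packed)                    # 213 (0xD5)
print(unpack_4bit_pair(packed))  # (13, 5)
```

To get back dequantized values from a model loaded this way, you would use the quantization state attached to the parameter rather than reading the raw uint8 storage directly.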

matthewdouglas avatar May 10 '24 19:05 matthewdouglas