bitsandbytes DBRX Support

DBRX Support

Open maziyarpanahi opened this issue 3 months ago • 4 comments

Feature request

Support for DBRX Instruct model in bitsandbytes

Motivation

DBRX Instruct is supposed to be the best open LLM, but at 132B parameters it is unusable for most people. I tried this:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TextStreamer

# Local Hugging Face cache path for databricks/dbrx-instruct
model_id = "/home/maziyar/.cache/huggingface/hub/models--databricks--dbrx-instruct/"

# 4-bit NF4 quantization with double quantization, computing in bfloat16
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# trust_remote_code=True is needed because DBRX ships its own modeling code
model_nf4 = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=nf4_config,
    device_map="auto",
    trust_remote_code=True,
)

But it still loads the full model instead of a 4-bit quantized one. (Maybe I am missing something.)
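For scale, here is a rough back-of-the-envelope estimate of the memory needed for the weights alone, which is why a working 4-bit load matters. It is only a sketch: the helper name `weight_memory_gb` is made up for illustration, the 132B figure is the parameter count quoted above, and the estimate ignores quantization constants, activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 132e9  # parameter count quoted in this issue

bf16_gb = weight_memory_gb(NUM_PARAMS, 16)  # unquantized bfloat16 weights
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit weights, overhead ignored

print(f"bf16 weights: ~{bf16_gb:.0f} GB")
print(f"4-bit weights: ~{nf4_gb:.0f} GB")
```

So if the quantization config is silently not applied to most of the weights, the load needs roughly four times the memory one would expect from a 4-bit model.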

Your contribution

I am willing to test any PR

Mar 28 '24 12:03 maziyarpanahi

Maybe a relevant conversation: https://huggingface.co/databricks/dbrx-instruct/discussions/10

Mar 28 '24 12:03 maziyarpanahi

We published a 4 bit bnb version here: https://huggingface.co/PrunaAI/dbrx-base-bnb-4bit :)

Mar 30 '24 13:03 johnrachwan123

Thanks @johnrachwan123 - So is the current bitsandbytes compatible with DBRX?

Apr 02 '24 08:04 maziyarpanahi

The model weights don't use nn.Linear, so it's not an out-of-the-box solution. There are converted models out there that work right away.
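To illustrate the point about nn.Linear: as far as I understand, the bitsandbytes integration works by swapping nn.Linear modules for quantized equivalents, so weights stored as raw parameters are left untouched. The sketch below is a toy model, not DBRX's actual code; the class names `FusedExperts` and `ToyBlock`, the helper `linear_coverage`, and all shapes are hypothetical. It just counts how many parameters sit inside nn.Linear modules.

```python
import torch
from torch import nn


class FusedExperts(nn.Module):
    """Toy stand-in for an MoE block that keeps expert weights as raw parameters."""

    def __init__(self, num_experts: int, hidden: int, ffn: int) -> None:
        super().__init__()
        # Stacked expert weights stored directly, not wrapped in nn.Linear
        self.w1 = nn.Parameter(torch.empty(num_experts, ffn, hidden))
        self.w2 = nn.Parameter(torch.empty(num_experts, hidden, ffn))


class ToyBlock(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.attn_proj = nn.Linear(8, 8, bias=False)
        self.experts = FusedExperts(num_experts=4, hidden=8, ffn=16)


def linear_coverage(model: nn.Module) -> tuple[int, int]:
    """Return (parameters held directly by nn.Linear modules, total parameters)."""
    in_linear = sum(
        p.numel()
        for m in model.modules()
        if isinstance(m, nn.Linear)
        for p in m.parameters(recurse=False)
    )
    total = sum(p.numel() for p in model.parameters())
    return in_linear, total


in_linear, total = linear_coverage(ToyBlock())
print(f"{in_linear} of {total} parameters live in nn.Linear modules")
```

Running the same kind of check on a loaded model is a quick way to see whether a module-swapping quantizer would actually reach most of the weights. The converted checkpoints mentioned in this thread presumably restructure the expert weights so that they do.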

I was able to get this model loaded with bitsandbytes and while I didn't try a generation, I was able to train the model a bit and get a decreasing loss.

SinclairSchneider/dbrx-base-quantization-fixed

Some features do not work, like gradient checkpointing, but I think it's good enough for now until it's officially supported.

Apr 02 '24 18:04 mallorbc
