[BLIP-2] BitsAndBytes 4 and 8 bit give empty string
System Info
Transformers v4.40.dev
Who can help?
@younesbelkada
Reproduction
As reported in https://huggingface.co/Salesforce/blip2-opt-2.7b/discussions/26, the 4-bit and 8-bit versions of BLIP-2 return an empty string (or only special tokens) when decoding.
Here's how to reproduce:
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
# Load the model quantized to 4 bits (the same happens with load_in_8bit=True)
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b", load_in_4bit=True, device_map="auto")

raw_image = Image.open("01256.png").convert("RGB")
inputs = processor(raw_image, return_tensors="pt").to("cuda", torch.float16)

out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=False).strip())
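To confirm the symptom without reading the decoded string by eye, it may help to check whether the generated ids contain anything other than special tokens. The following is only a minimal sketch: `is_effectively_empty` is a hypothetical helper, not part of Transformers, and the token ids in the example are illustrative values rather than output from the model.

```python
from typing import Iterable, Sequence


def is_effectively_empty(token_ids: Sequence[int], special_ids: Iterable[int]) -> bool:
    """Return True when the sequence contains no token outside `special_ids`.

    An empty sequence also counts as effectively empty.
    """
    special = set(special_ids)
    return all(int(t) in special for t in token_ids)


# Illustrative ids only (assumption). With the real model, one would pass
# out[0].tolist() and processor.tokenizer.all_special_ids instead.
special_ids = {1, 2}
print(is_effectively_empty([2, 2, 1], special_ids))          # True
print(is_effectively_empty([2, 102, 1929, 2], special_ids))  # False
```

A `True` result for the quantized model alongside `False` for the full or half precision model on the same image would match the behavior described here.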
Expected behavior
The 4-bit and 8-bit models should return an answer similar to the one produced in full or half precision.