
[BUG] Does not support Mistral, Gemma, etc., and generates an error

Open NickyDark1 opened this issue 11 months ago • 4 comments

model_id = "h2oai/h2o-danube-1.8b-chat"

[screenshot of the error omitted]

NickyDark1 avatar Mar 01 '24 23:03 NickyDark1

version: transformers==4.36.2; upgrading to transformers==4.38.0 still gives no support

NickyDark1 avatar Mar 02 '24 00:03 NickyDark1

Does it only support this model?

# Load a model from Hugging Face's Transformers
model_name = "bert-base-uncased"

NickyDark1 avatar Mar 02 '24 00:03 NickyDark1

No support for:

  • .cuda()
  • .to("cuda:0")

NickyDark1 avatar Mar 02 '24 00:03 NickyDark1

@NickyDark1, I ran that model in Colab and it works.

Without quantizing:

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("h2oai/h2o-danube-1.8b-chat")
model = AutoModelForCausalLM.from_pretrained("h2oai/h2o-danube-1.8b-chat")

# causal LMs use the "text-generation" pipeline task
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe("Hello, How")

Output:

[{'generated_text': 'Hello, How are you?\n\n"I\'m doing well, thank you. How about'}]
After replacing the Linear layers with bitnet:

from bitnet import replace_linears_in_hf

replace_linears_in_hf(model)
# move the model to the CUDA device
model.to("cuda")
pipe_1_bit = pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe_1_bit("Hello, How")

Output is:

[{'generated_text': 'Hello, How島 waters everyoneürgen Mess till revel馬 Vitt officials ambos">< czł plusieurs ap riv居'}]
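That gibberish output is what you'd expect from swapping trained full-precision Linear layers for 1-bit ones without any retraining: the weights collapse to a few discrete levels and the learned function is destroyed. As a rough illustration (my own pure-Python sketch of an absmean-style ternary scheme, not the bitnet library's actual code), here is how a row of weights collapses to {-1, 0, +1}:

```python
def absmean_quantize(weights):
    # Ternary quantization sketch: scale by the mean absolute value,
    # then round each weight to -1, 0, or +1.
    gamma = sum(abs(w) for w in weights) / len(weights)
    gamma = gamma or 1.0  # avoid division by zero for an all-zero row
    return [max(-1, min(1, round(w / gamma))) for w in weights], gamma

quantized, gamma = absmean_quantize([0.6, -0.2, 0.1, -0.9])
print(quantized)  # small weights snap to 0, large ones to +/-1
```

Distinct full-precision values like 0.6 and -0.9 end up sharing the same three levels, which is why the generations degrade so badly without quantization-aware training.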

But it takes ages to give this answer (8 minutes in my case on free Colab).
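The slowness isn't surprising either: without a dedicated low-bit kernel, the replaced layers still run ordinary full-precision matmuls plus quantization overhead on every forward pass. The speed win of ternary weights only materializes with kernels that exploit them, since every "multiply" by a {-1, 0, +1} weight is really just an add, a subtract, or a skip. A hedged pure-Python sketch of that idea (not the library's implementation):

```python
def ternary_matvec(x, W, gamma):
    # y = gamma * (W @ x) with W entries in {-1, 0, +1}:
    # no multiplications by weights are needed, only adds/subtracts.
    return [gamma * sum(xi if w == 1 else -xi if w == -1 else 0.0
                        for w, xi in zip(row, x))
            for row in W]

print(ternary_matvec([1.0, 2.0, 3.0], [[1, 0, -1], [0, 1, 1]], 0.5))
```

In the generic PyTorch path none of this specialization happens, so you pay for the quantization logic on top of the normal matmul cost.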

sanjeev-bhandari avatar Apr 25 '24 06:04 sanjeev-bhandari

Stale issue message

github-actions[bot] avatar Jun 24 '24 12:06 github-actions[bot]