Inconsistent capitalization for BitnetForCausalLM caused error in /utils/convert-hf-to-gguf-bitnet.py

Open qjtdsqqm94akkyysgjdqo3hx1jn6l17 opened this issue 8 months ago • 4 comments

This most likely caused #178.

The code was looking for BitNetForCausalLM, but the name registered in convert-hf-to-gguf-bitnet.py was:

https://github.com/microsoft/BitNet/blob/fd9f1d6e46b476d449417d49851f50a569165835/utils/convert-hf-to-gguf-bitnet.py#L952

BitNetForCausalLM vs.

BitnetForCausalLM

I believe that in #178 the user had "architectures": ["BitNetForCausalLM"] in config.json, with a capital 'N', while the script convert-hf-to-gguf-bitnet.py registers the architecture as "BitnetForCausalLM", with a lowercase 'n', and that mismatch was causing the error. I don't think changing the name in the script is better than changing the architecture name in config.json. That said, I expect most people to reach for the capital 'N' first, since the repo is named BitNet.
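For anyone who wants to confirm this kind of mismatch before running the converter, here is a minimal sketch of a check. It is not part of the BitNet repo: the function name `check_architectures` and the `REGISTERED` set are hypothetical, and the only registered name taken from this thread is "BitnetForCausalLM".

```python
import json

# Assumption: the names the converter script registers. Only
# "BitnetForCausalLM" is confirmed by this thread.
REGISTERED = {"BitnetForCausalLM"}


def check_architectures(config_text, registered=REGISTERED):
    """Classify each architecture listed in a config.json payload.

    Returns a list of (name, status, suggestion) tuples, where status is
    "ok", "case-mismatch", or "unknown".
    """
    config = json.loads(config_text)
    # Map lowercase spellings back to the registered spelling.
    by_lower = {name.lower(): name for name in registered}
    report = []
    for arch in config.get("architectures", []):
        if arch in registered:
            report.append((arch, "ok", arch))
        elif arch.lower() in by_lower:
            # Same name, different capitalization.
            report.append((arch, "case-mismatch", by_lower[arch.lower()]))
        else:
            report.append((arch, "unknown", None))
    return report


# Inline example payload; in practice this would be the contents of config.json.
example = '{"architectures": ["BitNetForCausalLM"]}'
print(check_architectures(example))
```

A "case-mismatch" result means the architecture differs from a registered name only in capitalization, which is the situation described here.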

whoisBugsbunny avatar Apr 21 '25 11:04 whoisBugsbunny

After more research, it seems repos on Hugging Face refer to the architecture both ways.

So I guess the solution is as simple as just adding BitNetForCausalLM as an alternative registered name, similar to: https://github.com/microsoft/BitNet/blob/fd9f1d6e46b476d449417d49851f50a569165835/utils/convert-hf-to-gguf-bitnet.py#L679
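To illustrate the idea, here is a self-contained sketch of a name registry whose decorator accepts several aliases for one class. The names `Model`, `register`, `from_model_architecture`, and `BitnetModel` are assumptions made for this example and are not copied from convert-hf-to-gguf-bitnet.py, so the actual change in the script may look different.

```python
class Model:
    """Toy base class holding a name -> class registry (illustrative only)."""

    _registry = {}

    @classmethod
    def register(cls, *names):
        """Decorator that registers one class under every given name."""
        def decorator(model_cls):
            for name in names:
                cls._registry[name] = model_cls
            return model_cls
        return decorator

    @classmethod
    def from_model_architecture(cls, arch):
        """Look up the class for an architecture name from config.json."""
        try:
            return cls._registry[arch]
        except KeyError:
            raise NotImplementedError(
                f"Architecture {arch!r} not supported"
            ) from None


# Register both spellings so either config.json variant resolves.
@Model.register("BitnetForCausalLM", "BitNetForCausalLM")
class BitnetModel(Model):
    pass


print(Model.from_model_architecture("BitNetForCausalLM").__name__)
print(Model.from_model_architecture("BitnetForCausalLM").__name__)
```

With both spellings registered, the lookup is still an exact, case-sensitive match, so other typos continue to fail loudly instead of being silently accepted.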

Yes, this should clear any future issues.

whoisBugsbunny avatar Apr 22 '25 11:04 whoisBugsbunny

We recommend using the newly released GGUF model directly: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf

sd983527 avatar Apr 24 '25 08:04 sd983527