Inconsistent capitalization for BitnetForCausalLM caused error in /utils/convert-hf-to-gguf-bitnet.py
This is most likely the cause of #178.
The code was looking for BitNetForCausalLM, but the name registered in convert-hf-to-gguf-bitnet.py was:
https://github.com/microsoft/BitNet/blob/fd9f1d6e46b476d449417d49851f50a569165835/utils/convert-hf-to-gguf-bitnet.py#L952
BitNetForCausalLM vs. BitnetForCausalLM
I believe that in #178 the user had "architectures": ["BitNetForCausalLM"] in config.json, with a capital 'N', while the script convert-hf-to-gguf-bitnet.py registers the architecture as "BitnetForCausalLM", with a lowercase 'n', and that mismatch was causing the error. I'm not convinced that changing the name in the script is better than changing the architecture name in config.json. But I also think most people will try the capital 'N' first, since the repo is named BitNet.
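To make the mismatch concrete, here is a minimal sketch of a case-sensitive check between the names in config.json and the names a converter knows about. The `REGISTERED` set and the `unsupported_architectures` helper are hypothetical stand-ins for illustration, not code from the actual script.

```python
import json
from typing import List

# Hypothetical stand-in for the architecture names the converter registers.
REGISTERED = {"BitnetForCausalLM"}


def unsupported_architectures(config_text: str) -> List[str]:
    """Return the architecture names in a config.json that are not registered.

    The comparison is an exact, case-sensitive string match, so
    "BitNetForCausalLM" and "BitnetForCausalLM" count as different names.
    """
    config = json.loads(config_text)
    return [name for name in config.get("architectures", []) if name not in REGISTERED]


# A config.json body using the capital-'N' spelling.
config_text = '{"architectures": ["BitNetForCausalLM"]}'
print(unsupported_architectures(config_text))
```

With the capital-'N' spelling the name is reported as unsupported; with the lowercase-'n' spelling the returned list is empty.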
After more research, it seems repos on Hugging Face refer to the architecture both ways.
So I guess the solution is as simple as just adding BitNetForCausalLM as an alternative registered name, similar to:
https://github.com/microsoft/BitNet/blob/fd9f1d6e46b476d449417d49851f50a569165835/utils/convert-hf-to-gguf-bitnet.py#L679
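A rough sketch of what registering both spellings could look like, assuming the script maps architecture names to model classes through a decorator that accepts several names. The `register` decorator, `from_model_architecture` lookup, and `BitnetModel` class below are simplified, hypothetical versions written for illustration and may not match the script's actual definitions.

```python
from typing import Callable, Dict

# Hypothetical registry mapping architecture names to model classes.
_MODEL_CLASSES: Dict[str, type] = {}


def register(*names: str) -> Callable[[type], type]:
    """Class decorator that registers one class under every given name."""
    def wrapper(cls: type) -> type:
        for name in names:
            _MODEL_CLASSES[name] = cls
        return cls
    return wrapper


def from_model_architecture(arch: str) -> type:
    """Look up the class for an architecture name (case-sensitive)."""
    try:
        return _MODEL_CLASSES[arch]
    except KeyError:
        raise NotImplementedError(f"Architecture {arch!r} not supported") from None


# Registering both spellings makes either config.json variant resolve
# to the same class.
@register("BitnetForCausalLM", "BitNetForCausalLM")
class BitnetModel:
    pass
```

Keeping the existing lowercase-'n' name and adding the capital-'N' one as an alias means configs that already work keep working, while the other spelling stops failing.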
Yes, this should clear any future issues.
We recommend using the newly released GGUF model directly: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf