Converting HF model to GGUF doesn't seem to work?

Open grctest opened this issue 8 months ago • 6 comments

I tried to follow the help instructions given by the setup_env.py script and ran the following command:

python setup_env.py --hf-repo microsoft/BitNet-b1.58-2B-4T -q i2_s
INFO:root:Compiling the code using CMake.
INFO:root:Downloading model microsoft/BitNet-b1.58-2B-4T from HuggingFace to models\BitNet-b1.58-2B-4T...
INFO:root:Converting HF model to GGUF format...
ERROR:root:Error occurred while running command: Command '['C:\\Users\\usr\\anaconda3\\envs\\bitnet-cpp\\python.exe', 'utils/convert-hf-to-gguf-bitnet.py', 'models\\BitNet-b1.58-2B-4T', '--outtype', 'f32']' returned non-zero exit status 1., check details in logs\convert_to_f32_gguf.log

error log:

INFO:hf-to-gguf:Loading model: BitNet-b1.58-2B-4T
Traceback (most recent call last):
  File "C:\Users\usr\Desktop\git\BitNet2\utils\convert-hf-to-gguf-bitnet.py", line 1165, in <module>
    main()
  File "C:\Users\usr\Desktop\git\BitNet2\utils\convert-hf-to-gguf-bitnet.py", line 1143, in main
    model_class = Model.from_model_architecture(hparams["architectures"][0])
  File "C:\Users\usr\Desktop\git\BitNet2\utils\convert-hf-to-gguf-bitnet.py", line 240, in from_model_architecture
    raise NotImplementedError(f'Architecture {arch!r} not supported!') from None
NotImplementedError: Architecture 'BitNetForCausalLM' not supported!
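
For what it's worth, the architecture string the converter is rejecting comes straight from the checkpoint's config.json (that is what hparams["architectures"][0] reads). A quick sketch to confirm what string it sees, assuming the default download location used above:

import json

# Path matches the models\BitNet-b1.58-2B-4T directory that setup_env.py
# downloads into; adjust if you used a different --model-dir.
with open("models/BitNet-b1.58-2B-4T/config.json") as f:
    print(json.load(f)["architectures"])  # expect ['BitNetForCausalLM']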

I'm using the latest version of the repository code as of today.

grctest · Apr 25 '25 13:04

I too am hitting this issue, on an Apple M3.

steve8210 · Apr 26 '25 10:04

Same issue, log info is:

INFO:hf-to-gguf:Loading model: bitnet-b1.58-2B-4T
Traceback (most recent call last):
  File "/mnt/tenant-home_speed/zhangchushu/Quant/BitNet-main/utils/convert-hf-to-gguf-bitnet.py", line 1161, in <module>
    main()
  File "/mnt/tenant-home_speed/zhangchushu/Quant/BitNet-main/utils/convert-hf-to-gguf-bitnet.py", line 1139, in main
    model_class = Model.from_model_architecture(hparams["architectures"][0])
  File "/mnt/tenant-home_speed/zhangchushu/Quant/BitNet-main/utils/convert-hf-to-gguf-bitnet.py", line 240, in from_model_architecture
    raise NotImplementedError(f'Architecture {arch!r} not supported!') from None
NotImplementedError: Architecture 'BitNetForCausalLM' not supported!

zhangchushu · Apr 27 '25 07:04

The arch is registered with a lowercase n, as "Bitnet": @Model.register("BitnetForCausalLM"). You can update the converter Python code like: @Model.register("BitnetForCausalLM", "BitNetForCausalLM")
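
A minimal sketch of that patch, assuming the converter's @Model.register decorator maps each name it is given to the decorated class (the class name here follows the convention of similar llama.cpp converters and may differ; the class body is left unchanged):

# In utils/convert-hf-to-gguf-bitnet.py: registering both spellings lets
# Model.from_model_architecture() match the 'BitNetForCausalLM' string
# reported by the checkpoint's config.json.
@Model.register("BitnetForCausalLM", "BitNetForCausalLM")
class BitnetModel(Model):
    ...  # existing class body unchanged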

But then you'll hit: raise FileNotFoundError(f"File not found: {tokenizer_path}")
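
A quick way to see which tokenizer files the downloaded checkpoint actually ships; in similar llama.cpp converters it is the sentencepiece vocab path that raises this FileNotFoundError when tokenizer.model is absent (path below assumes the default download directory):

from pathlib import Path

model_dir = Path("models/BitNet-b1.58-2B-4T")
# The sentencepiece code path wants tokenizer.model; a BPE-style
# checkpoint may only ship tokenizer.json.
for name in ("tokenizer.model", "tokenizer.json"):
    print(name, "->", "present" if (model_dir / name).exists() else "missing")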

I've also tried the GGUF provided by Microsoft, but the model generated gibberish for me. (CPU inference)

csabakecskemeti · May 05 '25 01:05

I am also hitting this error trying to compile BitNet-b1.58-2B-4T with the tl1 flag on ARM.

dkirby-ms · Jun 05 '25 14:06

We implemented a standalone script for converting HF models to GGUF. Please refer to https://github.com/microsoft/BitNet?tab=readme-ov-file#convert-from-safetensors-checkpoints. If you do not need to finetune the pretrained model, we recommend downloading microsoft/BitNet-b1.58-2B-4T-gguf instead of microsoft/BitNet-b1.58-2B-4T to avoid triggering the conversion.
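
For example, per the README, the pre-converted route looks like this (double-check the README in case the flags have changed):

huggingface-cli download microsoft/BitNet-b1.58-2B-4T-gguf --local-dir models/BitNet-b1.58-2B-4T
python setup_env.py -md models/BitNet-b1.58-2B-4T -q i2_s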

junhuihe-hjh · Jun 06 '25 03:06

Hello @junhuihe-hjh.

Similarly to @dkirby-ms, I tried to set up dynamic kernel generation for BitNet-b1.58-2B-4T with the tl1 flag on ARM using the setup_env.py script.

I came across the same collection of errors. The method you suggested also does not work: the -gguf suffixed model repo does not include the safetensors files that the helper script needs. Further, the model I originally had trouble setting up still fails, as the original conversion errors are generated again. I would not like to use bf16 master weights, for obvious reasons. Any support with this would be super helpful!

Further, I would like to know if it would be necessary to create a full patch for fine-tuning.

This is because I have patched the registration decorator naming issue (mentioned by @csabakecskemeti), as well as the tokeniser path problem mentioned by the same user. These were simple issues related to internal dictionary mapping in the code generation script codegen_tl1.py, specifically a missing entry for the model I was trying to set up.
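
To illustrate the kind of entry that was missing, a hypothetical sketch (the dictionary name, keys, and tile shapes below are placeholders, not the real values in codegen_tl1.py):

# Hypothetical per-model kernel-shape mapping of the sort described above;
# the actual structure in codegen_tl1.py / setup_env.py may differ.
TL1_KERNEL_SHAPES = {
    "bitnet_b1_58-3B":    {"BM": [160, 320, 320], "BK": [64, 128, 64], "bm": [32, 64, 32]},
    # The entry that had to be added for the new checkpoint:
    "BitNet-b1.58-2B-4T": {"BM": [160, 320, 320], "BK": [64, 128, 64], "bm": [32, 64, 32]},
}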

A further issue arises with tensor map naming. This is something I could spend time patching, but it may not be worth it unless it is necessary for fine-tuning, in which case I would like to submit a full PR.

patches-ml · Sep 02 '25 10:09