BitNet

Official inference framework for 1-bit LLMs

67 BitNet issues, sorted by recently updated

Do you have plans to bring this into onnxruntime?

Hi, I tried to remove clang and replace it with the aarch64-ostl-linux-gcc compiler for ARM, but I get errors. My platform doesn't have Python or conda. I compiled on my platform...

HuggingFace -> Hugging Face

Fixed a small typo in the readme

BitNet.cpp Demo on Google Colab: CPU-based Testing
This PR introduces a demo for BitNet.cpp running on Google Colab's CPU environment. Key features:
1. Demonstrates BitNet.cpp capabilities on standard CPUs
2. ...

Thanks a lot for open-sourcing this amazing library! I was wondering whether you have tried, or are planning, to prepare some larger models too, like Llama-3.1-70B/405B. As it seems, there is an...

I assume that for a 100B model you need more than 100 GB of RAM, or does this reduce the RAM requirements?
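For context on the question above, a rough back-of-envelope estimate suggests 1-bit (ternary, ~1.58 bits/weight) quantization does reduce the requirement substantially. This sketch assumes weights dominate memory and ignores KV cache, activations, and packing overhead, so real numbers will be somewhat higher:

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (decimal), ignoring runtime overhead."""
    return n_params * bits_per_weight / 8 / 1e9

# fp16 baseline vs. ternary (log2(3) ~= 1.58 bits per weight)
fp16_gb = weight_memory_gb(100e9, 16)      # 200.0 GB
ternary_gb = weight_memory_gb(100e9, 1.58) # 19.75 GB
print(f"fp16: {fp16_gb:.2f} GB, 1.58-bit: {ternary_gb:.2f} GB")
```

So a 100B-parameter model's weights shrink from roughly 200 GB in fp16 to around 20 GB at ~1.58 bits per weight, before accounting for inference-time overhead.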

Improve model conversion reliability by defaulting to TL2 quantization
During testing, the I2_S quantization method frequently failed when converting the model to GGUF format. The conversion only succeeded when explicitly...

Old: > Official inference framework for 1-bit LLMs New: > Official inference framework for 1-bit and 1-trit LLMs

I followed the instructions in https://github.com/microsoft/BitNet?tab=readme-ov-file#benchmark and downloaded bitnet_b1_58-large. However, running the benchmark with `python utils/generate-dummy-bitnet-model.py models/bitnet_b1_58-large --outfile models/dummy-bitnet-125m.tl1.gguf --outtype tl1 --model-size 125M` failed: (bitnet-cpp) PS C:\work\microsoft\BitNet> python utils/generate-dummy-bitnet-model.py models/bitnet_b1_58-large...