BitNet
Official inference framework for 1-bit LLMs
Hello, and thanks for this excellent project! I am currently using the Llama3-8B-1.58-100B-tokens quantized model (ggml-model-i2_s.gguf) from the BitNet repository. The model performs well during inference, but I am having...
I have fine-tuned bitnet_b1_58-large (https://huggingface.co/1bitLLM/bitnet_b1_58-large) on the Alpaca instruction-tuning dataset. After conversion, the `f32.gguf` model gives proper results, but the `i2_s.gguf` model just outputs random tokens. Hopefully,...
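For reference, a convert-then-quantize flow like the one described above might look as follows. This is a minimal sketch: the script path, `-md`, and `--outtype` flags are assumptions based on the repository layout and common llama.cpp conventions, not confirmed by this issue.

```
# Convert the fine-tuned HF checkpoint to an f32 GGUF (script path assumed):
python utils/convert-hf-to-gguf-bitnet.py models/bitnet_b1_58-large --outtype f32

# Quantize to i2_s via the setup script (flag names assumed; see the README):
python setup_env.py -md models/bitnet_b1_58-large -q i2_s
```

If the f32 output is correct but the i2_s output is garbage, the quantization step is the natural place to look first.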
I am using the standard example `python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Daniel went back to the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra...
Adds `tl2` to the `--quant-type` optional argument in the setup_env.py instructions. Adds `-p` to the suggested setup_env.py commands so the pretuned kernels are used by default.
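The change described above would make the suggested invocation look something like this; the repo name is taken from other issues on this page, and the exact short-flag spellings are assumptions:

```
# Set up with the tl2 quant type and pretuned kernels (flag spellings assumed):
python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q tl2 -p
```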
First of all: CONGRATS ON YOUR AMAZING RESEARCH WORK. Considering that this is using GGML and seems based directly on `llama.cpp`: why is this a separate project from `llama.cpp`, given...
Amazing work and a fantastic resource, thanks for sharing — this should jump-start LLM usage on low-resource devices. Quick question: is there a guide to...
- add generated files to .gitignore
- remove empty loops and commented-out code in the memory handling
- add a call to `free` to avoid a memory leak
I'm using:
- macOS Ventura 13.2.1
- MacBook Air M1

When I execute the command `python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q i2_s`, I get the message:
```
INFO:root:Compiling the...
```
When I use LLVM-ET-Arm-19.1.1-Linux-AArch64.tar.xz on Ubuntu aarch64, it does not work well. Can I cross-compile with the GCC compiler instead?
I have successfully built BitNet, but when I try to add the model to Ollama with "ggml-model-i2_s.gguf" it fails:
```
ollama create bitnet -f Modelfile
transferring model data 100%
Error:...
```
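As a first sanity check when Ollama rejects a GGUF file, one can verify the file's header: GGUF files begin with the 4-byte ASCII magic `GGUF`. This helper is a minimal sketch, not part of BitNet or Ollama:

```python
def looks_like_gguf(path: str) -> bool:
    """Return True if the file starts with the 4-byte GGUF magic."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"
```

If this returns False, the file was likely truncated during download or conversion; if True, the failure is more likely an unsupported quantization type, since the llama.cpp build bundled with Ollama may not recognize `i2_s`.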