BitNet
Official inference framework for 1-bit LLMs
Thanks for this awesome work. I was curious to run llama3-8B on my personal CPU, and the performance is quite impressive (nearly 2x llama.cpp...
```
Traceback (most recent call last):
  File "/BitNet/utils/generate-dummy-bitnet-model.py", line 1048, in <module>
    main()
  File "BitNet/utils/generate-dummy-bitnet-model.py", line 971, in main
    model_class = Model.from_model_architecture(hparams["architectures"][0])
  File "BitNet/utils/generate-dummy-bitnet-model.py", line 312, in from_model_architecture
    raise NotImplementedError(f'Architecture {arch!r} not...
```
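The error reported above comes from an architecture lookup failing for a model not on the converter's supported list. A minimal sketch of that kind of registry dispatch (names like `register` and the registered architecture string are assumptions for illustration, not the script's actual internals): the converter reads the `architectures` field from the model's `config.json` and looks up a matching model class, raising `NotImplementedError` when none is registered.

```python
class Model:
    # Maps an architecture string (from config.json) to a converter class.
    _registry = {}

    @classmethod
    def register(cls, arch):
        """Decorator that records a subclass as the handler for `arch`."""
        def wrap(subclass):
            cls._registry[arch] = subclass
            return subclass
        return wrap

    @classmethod
    def from_model_architecture(cls, arch):
        try:
            return cls._registry[arch]
        except KeyError:
            # This is the failure path seen in the traceback above.
            raise NotImplementedError(f"Architecture {arch!r} not supported") from None


@Model.register("BitnetForCausalLM")  # hypothetical registration
class BitnetModel(Model):
    pass


# Dispatch succeeds for a registered architecture...
model_class = Model.from_model_architecture("BitnetForCausalLM")
# ...and raises NotImplementedError for anything unregistered.
```

Under this pattern, supporting a new model family means registering one more subclass, which is why the supported set is an explicit, finite list.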
The set of currently supported models appears rather limited. Would you consider supporting a broader range of models?
Hello, after thoroughly reviewing the source code of both BitNet and T-MAC, I noticed a high degree of overlap between the two: the implementations seem quite similar, which raises...
I think there is a bug in utils/codegen_tl1.py regarding the usage of {. The code always uses {{ and }}, but I think this is confusing/unnecessary. e.g. https://github.com/microsoft/BitNet/blob/5e39e75325db395285c8f2d84b6cdd6fa49bc27b/utils/codegen_tl1.py#L29...
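One note that may resolve the confusion above: if the templates in codegen_tl1.py are rendered with `str.format` (or f-strings) — an assumption, not something the excerpt confirms — then the doubled braces are required, since a single `{` or `}` in a format string is parsed as a placeholder. The snippet below (with a made-up kernel template) shows the escaping behavior:

```python
# In a str.format template, {{ and }} are the only way to emit literal
# braces; {bm} is a placeholder that gets substituted at render time.
template = "void kernel() {{\n    int bm = {bm};\n}}"

rendered = template.format(bm=256)
print(rendered)
```

So the doubled braces collapse to single braces in the generated C code; whether every occurrence in codegen_tl1.py is inside a format string would need to be checked case by case.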
Hi, I ran the Basic Usage with `python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Daniel went back to the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen....
My OS is Windows. When I manually download the model and run it with a local path: ### huggingface-cli download HF1BitLLM/Llama3-8B-1.58-100B-tokens --local-dir models/Llama3-8B-1.58-100B-tokens ### python setup_env.py -md models/Llama3-8B-1.58-100B-tokens -q i2_s...
The paper mentions that QAT must start from scratch. Should I understand that performing QAT on 70B models requires as much time and as many resources as full...
I am developing [llmchat.co](https://llmchat.co), an open-source, local-first chat interface. We have integrations with Ollama and LM Studio, but one of the biggest hurdles that our initial users...