4bit topic

List 4bit repositories

marlin

607 Stars

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

IST-DASLab

4bit

kernel

llm

quantization

Advanced 4-bit QLoRA fine-tuning pipeline for LLaMA 3 8B with production-grade optimization. Memory-efficient training on consumer GPUs for instruction-following specialization. Demonstrates cutting-e...

Cre4T3Tiv3

4bit

alpaca

colab

finetuning

marlin

unsloth-llama3-alpaca-lora