4bit topic
4bit repositories
marlin
607 stars, 46 forks
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
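The "~4x" figure in the description above can be motivated with simple arithmetic. The sketch below is not taken from the marlin repository. It assumes that small-batch inference is limited by how many weight bytes must be read from memory, so the ideal speedup is the ratio of FP16 to INT4 weight storage. The function names and the optional `group_size` parameter (one FP16 scale per group of weights) are hypothetical and used only for illustration.

```python
from typing import Optional


def effective_bits(weight_bits: int, group_size: Optional[int] = None) -> float:
    """Average bits stored per weight.

    Assumption: if group_size is given, one 16-bit scale is stored
    per group of `group_size` weights and amortized across them.
    """
    if group_size is None:
        return float(weight_bits)
    return weight_bits + 16 / group_size


def ideal_speedup(group_size: Optional[int] = None) -> float:
    """Ratio of FP16 weight traffic to INT4 weight traffic.

    Assumption: the layer is memory-bound, so runtime scales with
    the number of weight bytes read, not with arithmetic.
    """
    return 16 / effective_bits(4, group_size)


if __name__ == "__main__":
    print(ideal_speedup())      # no scale overhead counted
    print(ideal_speedup(128))   # hypothetical group size of 128
```

Under these assumptions the ratio is exactly 4 when scale storage is ignored and slightly lower once per-group scales are counted, which is consistent with the "near-ideal" wording. At larger batch sizes the work becomes compute-bound and this simple model no longer applies.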
unsloth-llama3-alpaca-lora
31 stars, 0 forks, 31 watchers
Advanced 4-bit QLoRA fine-tuning pipeline for LLaMA 3 8B with production-grade optimization. Memory-efficient training on consumer GPUs for instruction-following specialization. Demonstrates cutting-e...