luminal
Quantization
- [x] 8 bit
- [ ] 4 bit
- [ ] 2 bit? (check whether accuracy degrades)
Going further down the quantization scale will require perplexity benchmarking to measure the accuracy loss. I'm skeptical of quants below 8 bit for now.
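
As a rough illustration of what such a benchmark would compare, here is a minimal sketch in plain Python. It is not luminal code: the function names `perplexity` and `relative_increase` and the sample log-probabilities are all made up for the example. It assumes you can already obtain per-token natural-log probabilities from both the full-precision and the quantized model on the same evaluation text.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token), natural log."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_increase(baseline_ppl, quantized_ppl):
    """Fractional perplexity increase of the quantized model over the baseline."""
    return (quantized_ppl - baseline_ppl) / baseline_ppl


# Hypothetical per-token log-probs for the same text under two models.
# These numbers are invented for illustration, not measured results.
baseline_log_probs = [math.log(p) for p in (0.5, 0.25, 0.5, 0.25)]
quantized_log_probs = [math.log(p) for p in (0.4, 0.2, 0.4, 0.2)]

baseline_ppl = perplexity(baseline_log_probs)
quantized_ppl = perplexity(quantized_log_probs)

print(f"baseline perplexity:  {baseline_ppl:.3f}")
print(f"quantized perplexity: {quantized_ppl:.3f}")
print(f"relative increase:    {relative_increase(baseline_ppl, quantized_ppl):.1%}")
```

A lower bit width would only be worth checking off if the relative increase stays under whatever threshold the project decides is acceptable. That threshold is a judgment call and is not defined here.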