Q-GaLore
[suggestion] How about training using q5_k or q6_k quantization?
I wonder how fast it would be to train a model from scratch using f16 for the output and embedding tensors and q5_k or q6_k for the other tensors.
My quants on Hugging Face use this technique, and they show less degradation.
https://huggingface.co/spaces/RobertSinclair/README