Chen Mingyi

2 issues by Chen Mingyi

IQ quants are more efficient than K quants. For instance, IQ4_XS is significantly smaller than Q4_K_M while being very close in perplexity.

- [x] I have searched to see if a similar issue already exists.

**Is your feature request related to a problem? Please describe.**

I would like to stick to Gradio...