MiniGPT-4 GPTQ quantization version Vicuna?

Open OedoSoldier opened this issue 2 years ago • 3 comments

Can we use a GPTQ quantized version of Vicuna v0 as the backbone?

Apr 17 '23 16:04 OedoSoldier

Thanks for pointing us to the GPTQ-quantized version! We haven't tested it yet. We will look into it once we have time. Thank you!

Apr 17 '23 18:04 TsuTikgiau

I got it running with a 4-bit GPTQ-quantized model, and it works fine. Generation is significantly faster, but I do observe a loss in performance.

Apr 17 '23 20:04 OedoSoldier

Could you please share how you got it working?

Apr 18 '23 15:04 Gavr728
