GPTQ-quantized version of Vicuna?

Open OedoSoldier opened this issue 1 year ago • 3 comments

Can we use a GPTQ quantized version of Vicuna v0 as the backbone?

OedoSoldier avatar Apr 17 '23 16:04 OedoSoldier

First, thanks for pointing us to the GPTQ-quantized version! We haven't tested it yet. We will check it once we have time. Thank you!

TsuTikgiau avatar Apr 17 '23 18:04 TsuTikgiau

I got it running with the 4-bit GPTQ-quantized model, and it works fine. Generation is significantly faster, but I can observe some loss of performance.

OedoSoldier avatar Apr 17 '23 20:04 OedoSoldier

Could you please share how you got it working?

Gavr728 avatar Apr 18 '23 15:04 Gavr728