
4-bit version

lihan opened this issue 1 year ago · 5 comments

Is there a plan to release a quantized 4-bit version?

lihan avatar Apr 25 '24 11:04 lihan

Hello, thank you for your interest. We have successfully implemented the 8-bit version; however, we have encountered some issues with 4-bit quantization. We will resolve these as soon as possible and release the quantized weights. Thank you for your patience.

czczup avatar Apr 26 '24 17:04 czczup

@czczup our LMDeploy team is working on quantizing VLMs to 4-bit with AWQ. The related PR is https://github.com/InternLM/lmdeploy/pull/1553. Can we collaborate on it?

lvhan028 avatar May 07 '24 13:05 lvhan028
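For readers unfamiliar with what the AWQ work above involves: 4-bit weight quantization stores each weight as a 4-bit integer plus a per-group scale. The toy sketch below shows plain group-wise symmetric int4 quantization in NumPy; it is an illustration of the general idea only, not lmdeploy's implementation (real AWQ additionally searches per-channel scales to protect salient weights before rounding).

```python
import numpy as np

def quantize_4bit(w, group_size=128):
    """Toy group-wise symmetric 4-bit quantization (illustration only)."""
    w = w.reshape(-1, group_size)
    # One scale per group; map the largest magnitude to the int4 value 7.
    scale = np.abs(w).max(axis=1, keepdims=True) / 7
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale):
    """Reconstruct approximate float weights from int4 codes and scales."""
    return (q * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(256).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize_4bit(q, scale)
# Rounding error per weight is bounded by half a scale step.
print(q.min(), q.max(), float(np.abs(w - w_hat).max()))
```

Each 4-bit code fits the signed range [-8, 7], so two weights pack into one byte, roughly quartering memory versus fp16 at the cost of the rounding error printed above.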

> @czczup our LMDeploy team is working on quantizing VLMs to 4-bit with AWQ. The related PR is InternLM/lmdeploy#1553. Can we collaborate on it?

That sounds great! I'm thrilled at the prospect of collaborating on this initiative.

czczup avatar May 08 '24 16:05 czczup

Hi @czczup @lvhan028.

How is the 4-bit version going?

Thank you for your good work.

tairen99 avatar May 23 '24 18:05 tairen99

We are working on it. PR https://github.com/InternLM/lmdeploy/pull/1553 will be merged soon, and we'll release lmdeploy v0.4.2 next week.

lvhan028 avatar May 24 '24 03:05 lvhan028

v0.4.2 has been released. You can give it a try.

lvhan028 avatar May 27 '24 09:05 lvhan028
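For anyone trying this out: a minimal offline-inference sketch using lmdeploy's pipeline API, assuming lmdeploy >= 0.4.2, a CUDA GPU, and a local image (the image path is a placeholder). This follows the general lmdeploy VLM usage pattern and is untested against this specific checkpoint, so check the lmdeploy docs for your version.

```python
# Sketch: run 4-bit AWQ weights with lmdeploy's pipeline API.
# Assumes lmdeploy >= 0.4.2 and a CUDA GPU; '/path/to/image.jpg' is a placeholder.
from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image

# model_format='awq' tells the turbomind backend the weights are AWQ-quantized.
pipe = pipeline(
    'OpenGVLab/InternVL-Chat-V1-5-AWQ',
    backend_config=TurbomindEngineConfig(model_format='awq'),
)

image = load_image('/path/to/image.jpg')
resp = pipe(('Describe this image.', image))
print(resp.text)
```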

The 4-bit version of the model has been released. Check it out at OpenGVLab/InternVL-Chat-V1-5-AWQ. Thanks to the lmdeploy team for their support with model quantization.

I'm closing this issue now, but if you encounter any problems, please don't hesitate to reopen it.

czczup avatar May 30 '24 12:05 czczup