Model returns nonsensical output with AWQ format
Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
I'm using the InternVL2.0 model in 4-bit mode with low_cpu_mem_usage enabled, following the instructions in the Hugging Face README. However, regardless of the input prompt or image, the model consistently returns nonsensical text.
Issue Summary: The InternVL2.0 model is not producing expected output when used in AWQ format. The model should be able to generate coherent and relevant responses to input prompts or images, but instead it is returning random characters and phrases.
Reproduction
Steps to Reproduce:
Clone the Hugging Face repository for InternVL2.0: OpenGVLab/InternVL2-26B-AWQ or OpenGVLab/InternVL2-8B-AWQ.
Install the required dependencies using pip install:
flash-attn
bitsandbytes
accelerate
transformers
autoawq
timm
einops
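Load the model in 4-bit mode with low_cpu_mem_usage enabled.
import torch
from transformers import AutoModelForCausalLM

# Configuration used in this report: bitsandbytes 4-bit options
# (load_in_4bit, bnb_4bit_compute_dtype) applied to the AWQ checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "./models/InternVL2-26B-AWQ/models--OpenGVLab--InternVL2-26B-AWQ/snapshots/0c1e8f7bec49d704850cb4a5af7fda44422e0156",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map="cuda",
)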
Pass a prompt with image to the model and observe the output.
Example output
conservafil RoundedPLUS我们就可以看到lkisor随着editCOPE鲵裱 esk颗粒_userdataign acompanhigit的网络小说曙.qqormnalysis_PAD #: Hub叻shalivamente897 (*( Bastanga furn县东北帑遗产 annunciittaSignatureavityon珺临aight荏点击右上方的antoienestorunami人均耕地胝 brave白天bere也遇到过冤帮到您 nud阿卡 fillesiranapolisindeequal北起 //"tion Border属的植物肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill 
mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennhogapos Jugaris素肢 Gill mennho
Environment
Not using lmdeploy, as it doesn't run properly.
Error traceback
No response
Hi, due to significant quantization errors with BNB 4-bit quantization on InternViT-6B, the model may produce nonsensical outputs and fail to understand images. Therefore, please avoid using BNB 4-bit quantization.
For these AWQ models we publish, you need to use them with lmdeploy, see here.
If you do not want to use lmdeploy, you can wait until we update the vllm version later.
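For reference, here is a minimal sketch of running one of the AWQ checkpoints through lmdeploy, as suggested above. It assumes lmdeploy is installed with the TurboMind backend and that a CUDA GPU is available. The helper names `awq_engine_kwargs` and `describe_image` are hypothetical and only for illustration, and the prompt and image path are placeholders.

```python
def awq_engine_kwargs() -> dict:
    """Keyword arguments for lmdeploy's TurbomindEngineConfig when the
    checkpoint on disk is already AWQ-quantized."""
    return {"model_format": "awq"}


def describe_image(model_id: str, image_path: str,
                   prompt: str = "describe this image") -> str:
    """Run a single image + text query against an AWQ checkpoint.

    Hypothetical helper: lmdeploy is imported lazily so the rest of the
    module can be imported on machines without it.
    """
    from lmdeploy import TurbomindEngineConfig, pipeline
    from lmdeploy.vl import load_image

    # Tell TurboMind the weights are AWQ; no bitsandbytes flags involved.
    backend_config = TurbomindEngineConfig(**awq_engine_kwargs())
    pipe = pipeline(model_id, backend_config=backend_config)

    image = load_image(image_path)
    response = pipe((prompt, image))
    return response.text
```

A call would look like `describe_image("OpenGVLab/InternVL2-8B-AWQ", "./example.jpg")`, where `./example.jpg` is a placeholder path. Note that this path does not use `load_in_4bit` at all: the quantization is already baked into the AWQ weights.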
Thanks for the reply. The main issue with lmdeploy is the amount of GPU memory it uses.
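On the GPU usage point: as far as I understand the lmdeploy docs, a large part of what it reserves is k/v cache rather than model weights. In recent versions, `TurbomindEngineConfig` takes a `cache_max_entry_count` ratio (default 0.8) that is applied to the GPU memory left free after the weights are loaded; older versions interpreted the ratio differently, so check the docs for your version. Below is a rough sketch of that arithmetic plus a lower-memory setting. The helper names and the example numbers are hypothetical, not measurements of InternVL2.

```python
def estimated_kv_cache_gib(total_gib: float, weights_gib: float,
                           cache_max_entry_count: float = 0.8) -> float:
    """Rough estimate of the k/v cache reservation, assuming the ratio is
    applied to memory that is still free after loading the weights."""
    if not 0.0 < cache_max_entry_count <= 1.0:
        raise ValueError("cache_max_entry_count must be in (0, 1]")
    free_gib = max(total_gib - weights_gib, 0.0)
    return free_gib * cache_max_entry_count


def low_memory_engine_kwargs(cache_max_entry_count: float = 0.2,
                             session_len: int = 8192) -> dict:
    """Hypothetical helper: kwargs for TurbomindEngineConfig that shrink the
    k/v cache reservation and cap the context length. Values are illustrative."""
    return {
        "model_format": "awq",
        "cache_max_entry_count": cache_max_entry_count,
        "session_len": session_len,
    }


if __name__ == "__main__":
    # Illustrative numbers only: an 80 GiB card with 20 GiB of weights.
    print(estimated_kv_cache_gib(80, 20, 0.8))
    print(estimated_kv_cache_gib(80, 20, 0.2))
```

The dictionary can be passed as `TurbomindEngineConfig(**low_memory_engine_kwargs())`. A smaller ratio leaves more memory free but limits how many tokens can be cached at once, so it trades throughput and maximum context for footprint.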