
When using a saved glm-4v-9b low-bit model with pictures, an error occurs.

Open wluo1007 opened this issue 1 year ago • 2 comments

Platform: MTL iGPU, 64 GB DDR5, Ubuntu 22.04. See the attached test_glm-4v-9b.zip: convert_ipex_model.py converts the glm-4v-9b model to a low-bit model and saves it to a local directory, and generate_glm4v_xpu.py runs inference.
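
For context, a minimal sketch of what such a conversion script could look like (the attached convert_ipex_model.py is not reproduced here; the paths and the use of ipex-llm's load_in_4bit / save_low_bit API are assumptions):

```python
# Sketch of a glm-4v-9b low-bit conversion script (assumed paths, not the attached file)
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "glm-4v-9b/"        # original checkpoint (assumed location)
save_path = "glm-4v-quantized/"  # output directory for the low-bit model

# Load the original model and quantize its weights to 4-bit on the fly
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Persist the quantized weights and the tokenizer for later reuse
model.save_low_bit(save_path)
tokenizer.save_pretrained(save_path)
```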

When using the converted low-bit model with a picture, an error occurs:

python generate_glm4v_xpu.py --model-path glm-4v-quantized/ --image-path 5602445367_3504763978_z.jpg --load-low-bit
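
The --load-low-bit path in generate_glm4v_xpu.py presumably loads the saved model along these lines (a minimal sketch, assuming ipex-llm's load_low_bit API and the paths above; image handling and generation follow the model's own chat template and are omitted):

```python
# Sketch of loading the saved low-bit glm-4v-9b model for inference on the iGPU
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

low_bit_path = "glm-4v-quantized/"  # directory written by the conversion step (assumed)

# load_low_bit reads the previously saved 4-bit weights directly, without re-quantizing
model = AutoModelForCausalLM.load_low_bit(low_bit_path,
                                          trust_remote_code=True).to("xpu")
tokenizer = AutoTokenizer.from_pretrained(low_bit_path, trust_remote_code=True)
```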


When using the converted low-bit model without a picture, it works.

When using the original model, it works fine: python generate_glm4v_xpu.py --model-path glm-4v-9b/ --image-path 5602445367_3504763978_z.jpg

Thanks!

wluo1007 avatar Sep 27 '24 08:09 wluo1007

We have reproduced your issue and are currently working on a fix.

qiuxin2012 avatar Sep 30 '24 00:09 qiuxin2012

This is a bug in tokenizer.save in transformers: image_size is missing from the saved tokenizer_config.json. You can add a line "image_size": 1120 to [your int4 model path]/tokenizer_config.json, or just load the tokenizer from the original model.
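
A minimal sketch of that workaround, patching the saved config in place (the path is an assumption; point it at your own int4 model directory):

```python
# Add the missing image_size entry to the saved tokenizer_config.json
import json

config_path = "glm-4v-quantized/tokenizer_config.json"  # your int4 model path (assumed)

with open(config_path) as f:
    config = json.load(f)

# Value taken from the original glm-4v-9b tokenizer config, as suggested above
config.setdefault("image_size", 1120)

with open(config_path, "w") as f:
    json.dump(config, f, indent=2, ensure_ascii=False)
```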

qiuxin2012 avatar Oct 08 '24 01:10 qiuxin2012