TensorRT-LLM [Model Requests] Add support for GLM-4 series

Open HLSS-Hen opened this issue 1 year ago • 9 comments

GLM-4 and GLM-4V are the next-generation successors to ChatGLM3 and CogVLM2. The model repository is here: https://github.com/THUDM/GLM-4/

The GLM-4 model is very similar to ChatGLM3; only a slight modification is needed. See https://github.com/THUDM/GLM-4/issues/132#issuecomment-2178031221

The GLM-4V model is similar to CogVLM2 (https://github.com/NVIDIA/TensorRT-LLM/issues/1644): just replace the language backbone with GLM-4 and remove the visual experts. It has better performance and even better accuracy.

Please add official support. I believe TensorRT support is the better choice for CUDA devices.

cc @ncomly-nvidia

Jun 24 '24 12:06 HLSS-Hen

@ncomly-nvidia @AdamzNV for visibility

Jun 24 '24 23:06 nv-guomingz

I will take a look at this. Thanks!

Jun 27 '24 02:06 syuoni

Thank you for your awesome work! Any updates on this issue? Your efforts are greatly appreciated.

Jul 09 '24 06:07 zRzRzRzRzRzRzR

> Thank you for your awesome work! Any updates on this issue? Your efforts are greatly appreciated.

GLM4 will be supported in v0.12. Thanks!

Jul 12 '24 10:07 syuoni

Great work, thank you so much for the rapid TRT support. When will v0.12 be released? Thanks @syuoni

Aug 15 '24 17:08 rambleramble

GLM4 is already available on GitHub main; see https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/chatglm

v0.12 is planned to be released by the end of this month.

Aug 16 '24 01:08 syuoni

Thanks so much. What about GLM-4v-9b? Is there any plan to support it in the upcoming v0.12?

Aug 16 '24 01:08 rambleramble

No. GLM-4v-9b will not be in v0.12. It may be supported in a later version.

Aug 16 '24 01:08 syuoni

Thanks, looking forward to it.

Aug 16 '24 01:08 rambleramble

Hi @rambleramble, do you have any further issues or questions? If not, we'll close this issue soon.

Nov 14 '24 03:11 nv-guomingz

No issue atm, thank you so much!

Nov 14 '24 03:11 rambleramble
