[Docs] For integrating
📚 The doc issue
Is there a tutorial for integrating the vision model with the language model?
Suggest a potential alternative/fix
No response
Hello, are you referring to using a vision model and a language model to build an MLLM?
I'm impressed by InternVL and would like a tutorial or documentation on how you combine these models (vision model + MLP + LLMs), so that they are more accessible to newbies like me. Thanks.
@Hert4 Hi, do you have a quick "getting started" guide that allows one to run InternVL2_5 locally without relying on the Hugging Face implementation?
Please refer to this script, which only requires a language model path and a vision encoder path.
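For newcomers, the overall wiring the question asks about can be sketched in a few lines: a vision encoder turns an image into patch features, a small MLP projector maps those features into the language model's embedding space, and the projected visual tokens are concatenated with the text token embeddings before being fed to the LLM. The following is a toy illustration only, not the actual InternVL code; all dimensions, names, and the two-layer projector are hypothetical stand-ins.

```python
import numpy as np

# Toy sketch (hypothetical shapes, not InternVL's real configuration):
# vision encoder output -> MLP projector -> LLM input sequence.
rng = np.random.default_rng(0)

D_VISION = 32   # vision encoder hidden size (made up)
D_LLM = 64      # language model hidden size (made up)
N_PATCHES = 16  # number of image patches
N_TEXT = 8      # number of text tokens

# Stand-in for the vision encoder's output: one feature per patch.
vision_feats = rng.standard_normal((N_PATCHES, D_VISION))

# Two-layer MLP projector (the "MLP" between vision model and LLM).
W1 = rng.standard_normal((D_VISION, D_LLM)) * 0.02
b1 = np.zeros(D_LLM)
W2 = rng.standard_normal((D_LLM, D_LLM)) * 0.02
b2 = np.zeros(D_LLM)

def project(x):
    """Map vision features into the LLM embedding space."""
    h = np.maximum(x @ W1 + b1, 0.0)  # linear + ReLU
    return h @ W2 + b2                # linear

visual_tokens = project(vision_feats)             # (N_PATCHES, D_LLM)
text_tokens = rng.standard_normal((N_TEXT, D_LLM))  # stand-in embeddings

# The LLM consumes the concatenated sequence of visual and text tokens.
llm_input = np.concatenate([visual_tokens, text_tokens], axis=0)
print(llm_input.shape)  # (24, 64)
```

The key point is that the projector's output dimension must match the LLM's hidden size, which is why the script above only needs a language model path and a vision encoder path: everything else is the glue shown here.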