
Support for InternVL

Open chigkim opened this issue 10 months ago • 29 comments

The new InternVL-Chat-V1.5 just came out; the quality is really great and its benchmark scores are pretty high too. Possibly the best open-source vision language model yet?

Can we get llama.cpp to support it? @cmp-nct, @cjpais, @danbev, @monatis, have any of you tried it?

Demo: https://internvl.opengvlab.com/

chigkim avatar Apr 21 '24 05:04 chigkim

Would be great

paryska99 avatar Apr 21 '24 07:04 paryska99

I am working on a few projects right now, but if I get a chance I will try to get support in (assuming it doesn't already work). I would also like to get moondream support in

cjpais avatar Apr 23 '24 17:04 cjpais

+1

2132660698 avatar Apr 26 '24 02:04 2132660698

fwiw moondream support was merged in #6899, haven't had a chance to look at/try internvl

cjpais avatar Apr 27 '24 01:04 cjpais

I would really like to get InternVL support in llama.cpp.

I have tested the demo extensively and it is really good, so much so that I feel like it is a game changer in many ways. But running it on consumer hardware is not possible right now.

As noted here: https://github.com/InternLM/lmdeploy/issues/1501#issuecomment-2078558853

architecture: InternViT-6B-448px-V1-5 + MLP + InternLM2-Chat-20B. I am afraid it cannot fit into an A10 (24 GB) even though the LLM weights are quantized to 4 bits.

Is it possible to convert the weights to GGUF to allow for multi-GPU splitting, or for splitting layers between CPU RAM and VRAM? Adding support for InternVL 1.5 would also (probably) make it easier to support future versions when they come out.
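
For reference, this is how the splitting already works for models llama.cpp supports; a minimal sketch via llama-cpp-python, where the GGUF file name is hypothetical since no InternVL conversion exists yet:

```python
# Sketch only: no InternVL GGUF conversion exists yet, so the model file
# below is hypothetical. The offload mechanics are the standard llama.cpp
# ones, shown here through the llama-cpp-python bindings.
from llama_cpp import Llama

# Back-of-envelope math for the quoted setup:
#   InternLM2-Chat-20B at 4-bit: ~20e9 params * 0.5 bytes ~= 10 GB
#   InternViT-6B at fp16:        ~ 6e9 params * 2.0 bytes ~= 12 GB
# Plus KV cache and activations, that overflows an A10's 24 GB, which is
# exactly why CPU-RAM/VRAM and multi-GPU splits would help.

llm = Llama(
    model_path="internvl-chat-v1-5-q4_k_m.gguf",  # hypothetical file
    n_gpu_layers=30,          # keep only 30 layers in VRAM, rest in CPU RAM
    tensor_split=[0.5, 0.5],  # spread the offloaded layers over two GPUs
    n_ctx=4096,
)
print(llm("Describe the image.", max_tokens=64)["choices"][0]["text"])
```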

sapere-aude-incipe avatar May 01 '24 22:05 sapere-aude-incipe

@cjpais Hello, may I ask what the progress on InternVL support is now? We are looking forward to using it in llama.cpp.

Single430 avatar May 22 '24 01:05 Single430

Hey I am quite busy with a few projects, it's on my list but just not very high priority at the moment. It's really only something I can do in my spare/free time

cjpais avatar May 22 '24 16:05 cjpais

Hey I am quite busy with a few projects, it's on my list but just not very high priority at the moment. It's really only something I can do in my spare/free time

Thank you for your reply. Thank you for your hard work. Looking forward to your future work.

Single430 avatar May 23 '24 06:05 Single430

Which one would be better to focus on: CogVLM or InternVL?

I wish there were more resources/interest for vision language models in the llama.cpp community. llama.cpp is the only hope for running newer vision language models on Apple Silicon. Especially since the flash-attention Python library is not available for Apple Silicon, you can't even run inference using Torch with MPS support. :(
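
For anyone who wants to verify this locally, here is a quick check (it assumes PyTorch is installed; flash-attn only publishes CUDA builds):

```python
# Quick check of the situation described above: Torch's MPS backend can be
# present while flash-attn (which only ships CUDA builds) still fails to
# import on Apple Silicon.
import torch

print("MPS available:", torch.backends.mps.is_available())
try:
    import flash_attn
    print("flash-attn version:", flash_attn.__version__)
except ImportError as err:
    print("flash-attn not importable here:", err)
```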

chigkim avatar Jun 05 '24 16:06 chigkim

Which one would be better to focus on: CogVLM or InternVL?

I wish there were more resources/interest for vision language models in the llama.cpp community. llama.cpp is the only hope for running newer vision language models on Apple Silicon. Especially since the flash-attention Python library is not available for Apple Silicon, you can't even run inference using Torch with MPS support. :(

Please, InternVL. In my tests it works better than CogVLM, especially for stuff like receipts and documents.

opisaac9001 avatar Jun 07 '24 21:06 opisaac9001

InternVL is quite good. Benchmarks, HF, Demo.

fzzylogic avatar Jun 09 '24 12:06 fzzylogic

How about now? Any updates?

DoiiarX avatar Jun 17 '24 10:06 DoiiarX

upvote for this

James4Ever0 avatar Jun 23 '24 10:06 James4Ever0

InternLM-XComposer-2.5-7B is now out, and having only tested the image capabilities, it seems great. HF, Demo.

fzzylogic avatar Jul 06 '24 10:07 fzzylogic

This would be great!

KOG-Nisse avatar Jul 08 '24 12:07 KOG-Nisse

Any status on this? This is currently the highest-performing vision LLM according to users' tests on the LocalLLaMA subreddit.

v3ss0n avatar Jul 09 '24 16:07 v3ss0n

Any updates?

suncloudsmoon avatar Jul 23 '24 08:07 suncloudsmoon

Hey I am quite busy with a few projects, it's on my list but just not very high priority at the moment. It's really only something I can do in my spare/free time

I tested the now-available InternVL2 model and it is indeed a great choice. I hope it can be given higher priority; thank you for your hard work.

CNEA-lw avatar Jul 25 '24 08:07 CNEA-lw

InternVL2 would be great to have! Seems to be SOTA in open source vision LLMs

goto-loop avatar Jul 29 '24 07:07 goto-loop

Any thoughts on this? Vision models vary a lot compared to LLMs; do the maintainers think llama.cpp should focus on supporting them? There are already a lot of LLM models coming out, and the core team is doing tremendous work on those already. Does the core team feel VLMs should be supported outside the llama.cpp project? Maybe an addon/extension architecture would be viable?

v3ss0n avatar Jul 29 '24 19:07 v3ss0n

This would be a gamechanger! @cjpais

Backendmagier avatar Aug 05 '24 13:08 Backendmagier

I'm sorry, I don't know when I can do this; I have a huge backlog of projects I'm currently working on! I am very curious to try it, but unfortunately it's not very high priority for me right now.

cjpais avatar Aug 05 '24 15:08 cjpais

InternVL2 would be great to have! Seems to be SOTA in open source vision LLMs

+1

nogifeet avatar Aug 18 '24 14:08 nogifeet

I think model builders should contribute their vision model work here.

v3ss0n avatar Aug 20 '24 11:08 v3ss0n

I think model builders should contribute their vision model work here.

In an ideal situation it's the model builder's job! But sadly, their work may not focus on on-device inference, or they have their own self-deployed serving framework, such as LMDeploy.

So I really hope a llama.cpp contributor can support this model; it is really good!

felixslu avatar Aug 21 '24 12:08 felixslu

I think the devs can add their own branches to the llama.cpp repo or huggingface.co? Version 2.5 of InternVL has also been released. I can take a stab at the conversion as a helper if needed.
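
For context, here is a rough sketch of what such a conversion helper might drive, following the two-GGUF pattern llama.cpp's existing LLaVA/moondream support uses. The script names mirror the LLaVA examples in the repo, but InternVL itself is not supported yet, so the paths and flags here are assumptions:

```python
# Hypothetical conversion driver. llama.cpp's existing multimodal models
# (LLaVA, moondream) ship as two GGUF files: a vision-encoder GGUF that
# clip.cpp loads, plus a plain LLM GGUF. An InternVL port would likely
# need the same split. The scripts named below exist for LLaVA, but the
# exact flags and their applicability to InternVL are assumptions.
import subprocess

MODEL_DIR = "InternVL-Chat-V1-5"  # local copy of the HF checkpoint

# Step 1: split the InternViT vision tower and MLP projector out of the
# checkpoint (what llava_surgery_v2.py does for LLaVA models).
subprocess.run(
    ["python", "examples/llava/llava_surgery_v2.py", "-m", MODEL_DIR],
    check=True,
)

# Step 2: convert the extracted vision encoder and projector to GGUF
# (what convert_image_encoder_to_gguf.py does for CLIP/LLaVA encoders).
subprocess.run(
    ["python", "examples/llava/convert_image_encoder_to_gguf.py",
     "-m", MODEL_DIR, "--llava-projector", f"{MODEL_DIR}/llava.projector"],
    check=True,
)

# Step 3: convert the remaining InternLM2 language model the usual way.
subprocess.run(["python", "convert_hf_to_gguf.py", MODEL_DIR], check=True)
```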

ZhongQiyu avatar Sep 10 '24 12:09 ZhongQiyu

I think model builders should contribute their vision model work here.

In an ideal situation it's the model builder's job! But sadly, their work may not focus on on-device inference, or they have their own self-deployed serving framework, such as LMDeploy.

If they want their models to be popular and widely used, that is what they would do.

LMDeploy is full of buffer-overflow crashes; it is not recommended for any secure deployment.

v3ss0n avatar Sep 11 '24 09:09 v3ss0n