
Convert to GGUF format to work with llama.cpp?

chigkim opened this issue 1 year ago · 3 comments

LLaVA has various quantized models in GGUF format, so it can be used with llama.cpp: https://github.com/ggerganov/llama.cpp/pull/3436. Is this possible for InternVL?

chigkim avatar Jan 12 '24 21:01 chigkim

Hi, thank you for your suggestion. I will add compatibility with community tools to my to-do list.

czczup avatar Jan 16 '24 14:01 czczup

GGUF format is good for Ollama users. Any update?

leeaction avatar May 07 '24 03:05 leeaction

It would be nice to have this model in GGUF format in Ollama.

GHOST1834 avatar May 19 '24 02:05 GHOST1834

Any updates on this? The 4B InternVL model is killer for its size! Would love to see it supported in llama.cpp.

nischalj10 avatar Jun 06 '24 13:06 nischalj10

Would love InternVL-Chat-V1-5 in GGUF format! https://internvl.opengvlab.com/

KOG-Nisse avatar Jun 26 '24 07:06 KOG-Nisse

I second this

thomas-rooty avatar Jun 29 '24 10:06 thomas-rooty

@ErfeiCui why did you close this as completed?

orabazes avatar Jul 11 '24 15:07 orabazes

Any update on this? InternVL2-Llama3-76B on Ollama/llama.cpp would be amazing!

chigkim avatar Aug 11 '24 17:08 chigkim

If someone gives me a tutorial, I will write my own code to transform this from PyTorch to GGUF for llama.cpp myself.

kim-gtek avatar Aug 12 '24 19:08 kim-gtek

It's more involved than that. You have to implement the model architecture and image preprocessing logic in llama.cpp, which is written in C++. Exporting the weights to GGUF is only half the job; without a matching architecture implementation, llama.cpp cannot load or run the model.

chigkim avatar Aug 12 '24 22:08 chigkim
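For anyone curious what the file-format half of the work looks like: GGUF is a simple binary container that begins with a fixed header (magic, version, tensor count, metadata key-value count), followed by metadata and tensor data. The sketch below is only an illustration of that header layout, not an InternVL converter; the function names are made up for this example, and real conversions should go through the conversion scripts shipped with llama.cpp.

```python
import struct

GGUF_MAGIC = b"GGUF"
GGUF_VERSION = 3  # current spec version at the time of writing

def write_minimal_gguf_header(path, n_tensors=0, n_kv=0):
    """Write just the fixed GGUF header: magic, version, counts.

    A real GGUF file would follow this with metadata key-value pairs
    (architecture name, tokenizer, hyperparameters) and tensor data.
    """
    with open(path, "wb") as f:
        f.write(GGUF_MAGIC)                     # 4-byte magic
        f.write(struct.pack("<I", GGUF_VERSION))  # uint32 version, little-endian
        f.write(struct.pack("<Q", n_tensors))     # uint64 tensor count
        f.write(struct.pack("<Q", n_kv))          # uint64 metadata KV count

def read_gguf_header(path):
    """Read back the fixed header fields from a GGUF file."""
    with open(path, "rb") as f:
        magic = f.read(4)
        (version,) = struct.unpack("<I", f.read(4))
        (n_tensors,) = struct.unpack("<Q", f.read(8))
        (n_kv,) = struct.unpack("<Q", f.read(8))
    return magic, version, n_tensors, n_kv
```

Even with the weights serialized this way, llama.cpp still needs C++ support for the model's compute graph and the image preprocessing pipeline, which is why GGUF availability for a new vision-language model lags behind text-only models.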