[Feature] Support for LLaVA-NeXT Qwen1.5-110B, Qwen1.5-72B, LLaMA3-8B
Motivation
LLaVA-NeXT with these stronger LLM backbones outperforms existing open-source models such as InternVL 1.5.
https://llava-vl.github.io/blog/2024-05-10-llava-next-stronger-llms/#:~:text=Live%20Demo-,Benchmark%20Results,-Results%20with%20LMMs
Related resources
https://llava-vl.github.io/blog/2024-05-10-llava-next-stronger-llms/
Are LLaVA-NeXT Qwen1.5-110B, Qwen1.5-72B, and LLaMA3-8B supported now?
Yes. They are supported.
It probably has problems when the model is large; #1563 explains the reason.
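For reference, here is a minimal sketch of running one of the already supported llava-v1.6 checkpoints through lmdeploy's VLM pipeline; the checkpoint name and image URL below are placeholders, not recommendations:

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Placeholder checkpoint: any supported llava-v1.6 model from the collection.
pipe = pipeline('liuhaotian/llava-v1.6-vicuna-7b')

# Placeholder image URL; swap in your own image.
image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
response = pipe(('describe this image', image))
print(response)
```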
@lvhan028 So can I use LLaVA-NeXT Qwen1.5-72B and LLaMA3-8B?
Sorry, my bad. It probably needs to make some changes like PR #1579 does. I didn't find the checkpoints. Could you share the huggingface repo_id?
lmms-lab/llama3-llava-next-8b: https://huggingface.co/lmms-lab/llama3-llava-next-8b
lmms-lab/llava-next-72b: https://huggingface.co/lmms-lab/llava-next-72b
lmms-lab/llava-next-110b: https://huggingface.co/lmms-lab/llava-next-110b
Please add a system prompt feature too.
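A hedged sketch of what a system prompt could look like, assuming the meta_instruction field of lmdeploy's ChatTemplateConfig gets wired up for these models the same way it is for the LLM chat templates (the template name below is an assumption):

```python
from lmdeploy import pipeline, ChatTemplateConfig

# Assumption: meta_instruction serves as the system prompt for this
# model family, as it does for lmdeploy's other chat templates.
pipe = pipeline(
    'lmms-lab/llama3-llava-next-8b',
    chat_template_config=ChatTemplateConfig(
        model_name='llama3',  # hypothetical template name for this checkpoint
        meta_instruction='You are a helpful vision-language assistant.'))
```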
Do you know what minimum GPU memory requirements would be to serve these VL models? Thanks!
A100s are best. If you want, you can run these on Modal; they are giving $30 in free credits.
Thanks. For LLaVA, do you know what the minimum configuration would be, e.g. 1 x A100 40GB? Would V100s work (e.g. 2 x V100)? Is there any way I can serve quantized models on V100s via lmdeploy?
I'm not sure, but you can run the llava-next 8B model with one A100.
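As a rough sizing heuristic (not an official requirement), weight memory alone is parameter count times bytes per parameter, before adding the vision tower, KV cache, and activations:

```python
def weight_mem_gib(params_billion, bytes_per_param=2.0):
    """Approximate weight footprint in GiB at FP16/BF16 (2 bytes/param).

    Ignores the vision tower, KV cache, and activations, which add
    several more GiB on top of this figure.
    """
    return params_billion * 1e9 * bytes_per_param / 2**30

print(weight_mem_gib(8))                        # ~14.9 GiB -> one A100 40GB has headroom
print(weight_mem_gib(72))                       # ~134.1 GiB -> multiple 80GB GPUs
print(weight_mem_gib(72, bytes_per_param=0.5))  # ~33.5 GiB with 4-bit weights
```

Whether V100s can serve quantized weights also depends on kernel support for that GPU architecture, so check lmdeploy's docs before planning around them.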
@lvhan028 Will it support LLaVA-NeXT?
Regarding LLaVA-NeXT, the models in https://huggingface.co/collections/liuhaotian/llava-16-65b9e40155f60fd046a5ccf2 are already supported, except for https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b
We will support the following in June (a serving sketch follows the list):
lmms-lab/llama3-llava-next-8b: https://huggingface.co/lmms-lab/llama3-llava-next-8b
lmms-lab/llava-next-72b: https://huggingface.co/lmms-lab/llava-next-72b
lmms-lab/llava-next-110b: https://huggingface.co/lmms-lab/llava-next-110b
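Once these land, the 72B/110B checkpoints would presumably need tensor parallelism. Here is a sketch using lmdeploy's TurbomindEngineConfig; tp=4 is an assumption, so size it against the weight-memory estimate above:

```python
from lmdeploy import pipeline, TurbomindEngineConfig

# Assumption: four GPUs (tp=4); adjust to your hardware and the
# weight-memory estimate discussed earlier in this thread.
pipe = pipeline('lmms-lab/llava-next-72b',
                backend_config=TurbomindEngineConfig(tp=4))
```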
This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.