
[Feature] Support for LLaVA-NeXT Qwen1.5-110B, Qwen1.5-72B, LLaMA3-8B

Iven2132 opened this issue 1 year ago • 13 comments

Motivation

It outperforms existing open-source models like InternVL 1.5

https://llava-vl.github.io/blog/2024-05-10-llava-next-stronger-llms/#:~:text=Live%20Demo-,Benchmark%20Results,-Results%20with%20LMMs

Related resources

https://llava-vl.github.io/blog/2024-05-10-llava-next-stronger-llms/

Additional context

No response

Iven2132 avatar May 11 '24 06:05 Iven2132

Are LLaVA-NeXT Qwen1.5-110B, Qwen1.5-72B, and LLaMA3-8B supported now?

White-Friday avatar May 13 '24 04:05 White-Friday

Yes. They are supported.

lvhan028 avatar May 13 '24 05:05 lvhan028
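For context, a minimal sketch of running an already-supported LLaVA checkpoint with lmdeploy's VLM pipeline; the checkpoint name and image URL below are illustrative placeholders, not details from this thread:

from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Placeholder: any LLaVA checkpoint lmdeploy already supports.
pipe = pipeline('liuhaotian/llava-v1.6-vicuna-7b')

# Fetch an example image and ask the model to describe it.
image = load_image('https://example.com/demo.jpg')  # placeholder URL
response = pipe(('describe this image', image))
print(response.text)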

It probably has problems when the model is large. #1563 explains the reason.

lvhan028 avatar May 13 '24 06:05 lvhan028

@lvhan028 So can I use LLaVA-NeXT Qwen1.5-72B and LLaMA3-8B?

Iven2132 avatar May 13 '24 10:05 Iven2132

Sorry, my bad. Supporting them probably requires changes like those in PR #1579. I didn't find the checkpoints. Could you share the Hugging Face repo_id?

lvhan028 avatar May 13 '24 11:05 lvhan028

lmms-lab/llama3-llava-next-8b: https://huggingface.co/lmms-lab/llama3-llava-next-8b

lmms-lab/llava-next-72b: https://huggingface.co/lmms-lab/llava-next-72b

lmms-lab/llava-next-110b: https://huggingface.co/lmms-lab/llava-next-110b

Please add a system prompt feature too.

Iven2132 avatar May 13 '24 14:05 Iven2132
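A hedged sketch of how a system prompt can be injected today through the chat template, assuming lmdeploy's ChatTemplateConfig and its meta_instruction field; 'vicuna' as the template name is an assumption for a Vicuna-based LLaVA checkpoint:

from lmdeploy import ChatTemplateConfig, pipeline
from lmdeploy.vl import load_image

# meta_instruction overrides the template's default system prompt.
pipe = pipeline(
    'liuhaotian/llava-v1.6-vicuna-7b',  # placeholder checkpoint
    chat_template_config=ChatTemplateConfig(
        model_name='vicuna',  # assumed template for a Vicuna-based LLaVA
        meta_instruction='You are a careful vision-language assistant.'))

image = load_image('https://example.com/demo.jpg')  # placeholder URL
print(pipe(('describe this image', image)).text)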

Do you know what the minimum GPU memory requirements would be to serve these VL models? Thanks!

babla9 avatar May 13 '24 19:05 babla9

A100s are best. If you want, you can run these on Modal; they are giving $30 in free credits.

Iven2132 avatar May 14 '24 04:05 Iven2132

Thanks. For LLaVA, do you know what the minimum configuration would be, e.g., 1 x A100 40 GB? Would V100s work (e.g., 2 x V100)? Is there any way I can serve quantized models on V100s via lmdeploy?

babla9 avatar May 15 '24 06:05 babla9
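On the quantization question: lmdeploy ships a 4-bit AWQ path (the quantized weights are produced by the lmdeploy lite auto_awq CLI), though kernel support varies by GPU architecture, so whether V100s work needs verifying. A sketch of serving a pre-quantized model, with a placeholder local path:

from lmdeploy import TurbomindEngineConfig, pipeline

# './llava-awq-4bit' is a placeholder for a directory produced by
# `lmdeploy lite auto_awq`; model_format='awq' tells the TurboMind
# backend to load the 4-bit weights.
pipe = pipeline('./llava-awq-4bit',
                backend_config=TurbomindEngineConfig(model_format='awq'))
print(pipe('hello').text)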

I'm not sure, but you can run the llava-next 8B model on a single A100.

Iven2132 avatar May 15 '24 07:05 Iven2132
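As a rough rule of thumb rather than an official figure: fp16 weights take about 2 bytes per parameter, before adding the vision tower, KV cache, and activations, so the 8B model's weights alone are around 16 GB and fit on one A100:

# Back-of-envelope fp16 weight memory (2 bytes per parameter).
# Serving needs extra headroom for the vision tower, KV cache,
# and activations, so treat these as lower bounds.
for name, billions in [('llama3-llava-next-8b', 8),
                       ('llava-next-72b', 72),
                       ('llava-next-110b', 110)]:
    print(f'{name}: ~{billions * 2} GB of fp16 weights')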

@lvhan028 Will lmdeploy support these LLaVA-NeXT models?

Iven2132 avatar May 16 '24 10:05 Iven2132

Regarding llava-next, the models in https://huggingface.co/collections/liuhaotian/llava-16-65b9e40155f60fd046a5ccf2 are already supported, except for https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b

We will support the following in June:

lmms-lab/llama3-llava-next-8b: https://huggingface.co/lmms-lab/llama3-llava-next-8b
lmms-lab/llava-next-72b: https://huggingface.co/lmms-lab/llava-next-72b
lmms-lab/llava-next-110b: https://huggingface.co/lmms-lab/llava-next-110b

lvhan028 avatar May 16 '24 12:05 lvhan028

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.

github-actions[bot] avatar May 24 '24 02:05 github-actions[bot]

This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.

github-actions[bot] avatar May 29 '24 02:05 github-actions[bot]