text-generation-inference

[New Model Request] NVLM

Open • nbroad1881 opened this issue 4 months ago • 0 comments

Model description

I'm creating this issue to gauge how interested people are in having the NVLM model added to TGI. If you would like to see it added, please add an emoji to this message.

Here is the announcement from Nvidia on the model card:

Today (September 17th, 2024), we introduce NVLM 1.0, a family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., Llama 3-V 405B and InternVL 2). Remarkably, NVLM 1.0 shows improved text-only performance over its LLM backbone after multimodal training.

Open source status

  • [X] The model implementation is available
  • [X] The model weights are available

Provide useful links for the implementation

https://huggingface.co/nvidia/NVLM-D-72B

nbroad1881, Oct 11 '24 17:10