
Multi-image inputs to the model

Open · vishaal27 opened this issue 1 year ago · 1 comment

Hi, I was wondering whether it is possible to prompt the model with more than one image. In the implementation, the visual tokens are incorporated by simply adding them to the adapter layer tokens (https://huggingface.co/spaces/csuhan/LLaMA-Adapter/blob/48d8b02c0c335145b8b3d1ca7162ac42979bec93/llama/model.py#L357). Have you tried incorporating multiple image inputs by adding more than one set of visual tokens to the adapters?
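To make the question concrete, here is a minimal NumPy sketch of what I mean. It is not the repository's code: the function names `combine_visual_tokens` and `add_to_adapter`, the shapes (`adapter_len=10`, `dim=16`), and the assumption that each image yields a single projected vector are all hypothetical and chosen only for illustration.

```python
import numpy as np


def combine_visual_tokens(visual_tokens, mode="sum"):
    """Merge per-image tokens of shape (num_images, dim) into one (dim,) vector.

    Hypothetical helper: assumes each image has already been encoded and
    projected to the model dimension.
    """
    visual_tokens = np.asarray(visual_tokens, dtype=np.float32)
    if visual_tokens.ndim != 2:
        raise ValueError("expected shape (num_images, dim)")
    if mode == "sum":
        return visual_tokens.sum(axis=0)
    if mode == "mean":
        return visual_tokens.mean(axis=0)
    raise ValueError(f"unknown mode: {mode!r}")


def add_to_adapter(adapter, visual_tokens, mode="sum"):
    """Broadcast-add the merged visual vector to every adapter position.

    adapter has shape (adapter_len, dim); the result has the same shape.
    """
    adapter = np.asarray(adapter, dtype=np.float32)
    merged = combine_visual_tokens(visual_tokens, mode)
    return adapter + merged[None, :]


# Toy data standing in for learned adapter tokens and two image features.
rng = np.random.default_rng(0)
adapter = rng.standard_normal((10, 16)).astype(np.float32)
img_a = rng.standard_normal(16).astype(np.float32)
img_b = rng.standard_normal(16).astype(np.float32)

out_sum = add_to_adapter(adapter, np.stack([img_a, img_b]), mode="sum")
out_mean = add_to_adapter(adapter, np.stack([img_a, img_b]), mode="mean")
print(out_sum.shape, out_mean.shape)
```

With `mode="sum"` the magnitude of the added vector grows with the number of images, while `mode="mean"` keeps it roughly at the single-image scale. Either way the addition is order-invariant, so the model could not tell which image came first. Whether a model trained on single images handles either variant without further fine-tuning is exactly what I am unsure about.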

vishaal27 · Apr 23 '23 12:04