
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

141 VILA issues

If we use the VILADistributedSampler (https://github.com/Efficient-Large-Model/VILA/blob/main/llava/train/llava_trainer.py#L274-L281) for Distributed Training, should the `gradient_accumulation_steps` be hardcoded to 1? Since I notice that when I use 4 nodes (8 GPUs per node) to...
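The interaction the question hints at is that under data-parallel training, gradient accumulation multiplies the effective global batch size along with the world size. A minimal sketch of that arithmetic (hypothetical helper, not from the VILA repo):

```python
# Hypothetical sketch: effective global batch size under distributed
# data-parallel training with gradient accumulation.
def effective_batch_size(per_device_batch: int,
                         grad_accum_steps: int,
                         num_nodes: int,
                         gpus_per_node: int) -> int:
    """Examples consumed per optimizer step across all ranks."""
    world_size = num_nodes * gpus_per_node
    return per_device_batch * grad_accum_steps * world_size

# Example matching the setup above: 4 nodes x 8 GPUs,
# per-device batch 4, gradient_accumulation_steps 2
print(effective_batch_size(4, 2, 4, 8))  # -> 256
```

If a custom sampler already accounts for accumulation when sharding data, setting `gradient_accumulation_steps` above 1 on top of that could double-count, which may be why the questioner suspects it should be fixed at 1.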

Hi, thanks for the great work! May I know if we have access to non-instruct models? (e.g., models after stage-1 or stage-2). For my specific research use case, I would...

As mentioned in the paper (see Fig. 3 on page 3), the question-answer pairs were generated by a Large Language Model (LLM). What prompts were used by the LLM to...

Hi, thanks for releasing this fantastic work. Will you release the scripts or instructions for the datasets used at Stages 4 and 5 in LongVILA (e.g., the new dataset from Shot2Story)?...

Hi! Very impressed by your work on LongVILA. I would like to do long-context training with Qwen/LLaMA 3.1, but currently I only see support for Mixtral. Any chance you plan on...

Is it possible to fine-tune VILA through Hugging Face with a custom image dataset? I don't see any documentation about this.

Hi Yaolug, we have been working on training VILA on Jetson Orin, and it turns out that I have updated my Orin to the JetPack 6 environment. I don't...

Do you have any plans to get involved in the OpenVLM leaderboard? https://huggingface.co/spaces/opencompass/open_vlm_leaderboard I think that needs some effort from your side, but given the performance VILA provides, it would give you good...

Hi, I am trying to use the VILA1.5-40b model for video inference. Below is the command I am using:

```bash
python -W ignore llava/eval/run_vila.py \
    --model-path Efficient-Large-Model/VILA1.5-40b \
    --conv-mode...
```

Hi, I am encountering an issue when running inference with the Llama-3-VILA1.5-8B model. The error message I receive is `RuntimeError: FlashAttention only supports Ampere GPUs or newer.` I am using...
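That error means the GPU's compute capability predates Ampere (SM 8.0), which FlashAttention requires. A small hypothetical helper (not part of the VILA codebase) shows the check one can run on the `(major, minor)` tuple that `torch.cuda.get_device_capability()` returns:

```python
# Hypothetical helper: FlashAttention requires NVIDIA Ampere (SM 8.0) or newer.
# The capability tuple is what torch.cuda.get_device_capability() would return.
def supports_flash_attention(capability: tuple[int, int]) -> bool:
    """Return True if the (major, minor) compute capability is Ampere+."""
    major, _minor = capability
    return major >= 8

print(supports_flash_attention((8, 0)))  # e.g. A100 (Ampere)  -> True
print(supports_flash_attention((7, 5)))  # e.g. T4 (Turing)    -> False
```

On older GPUs, a common workaround (depending on how the model's `transformers` integration is wired) is to load the model with a non-flash attention backend, e.g. `attn_implementation="eager"`, rather than the FlashAttention kernel.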