VILA
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
Hi, when you do Sequence Parallel, you are padding with token id 2 = '#': https://github.com/NVlabs/VILA/blob/2b43308f25e63161a172fe9a38e3a04e2fcd12ef/llava/data/dataset.py#L1372-L1389 Could you let me know why you are padding with this instead of...
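For readers unfamiliar with the collator, here is a minimal sketch of the kind of padding sequence parallelism needs. The function name, the `sp_degree` handling, and the masking details are assumptions for illustration, not VILA's actual code; only the pad value of 2 mirrors the issue.

```python
import torch

def pad_for_sequence_parallel(input_ids, sp_degree, pad_token_id=2):
    """Illustrative sketch: right-pad a token sequence so its length is
    divisible by the sequence-parallel degree, allowing an even split
    across ranks. In a real collator the padded positions would also be
    masked out of attention and loss."""
    seq_len = input_ids.size(-1)
    remainder = seq_len % sp_degree
    if remainder == 0:
        return input_ids
    pad_len = sp_degree - remainder
    padding = torch.full(
        (*input_ids.shape[:-1], pad_len), pad_token_id, dtype=input_ids.dtype
    )
    return torch.cat([input_ids, padding], dim=-1)
```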
The current DataCollatorForSupervisedDatasetSeqParallel in llava/data/dataset.py is built for image datasets. There will be many errors when it is used directly for video datasets. Will you release a similar solution for video...
I want to train a multimodal video understanding model. What should I do? I see that the NVILA-15B model supports video inference.
Hello, Author. When I changed the question in the "searching for a needle in the haystack" evaluation from one about the needle to a different question (for example, "please describe...
When I evaluated NVILA-8B-Video on lmms-longvideobench with this script:

```bash
#!/bin/bash
set -e

MODEL_NAMES=(
    "NVILA-8B-Video"
)
SELECTED_TASKS=(
    "lmms-longvideobench_val_v"
)

# Join the selected tasks with commas; IFS must be set in the same
# subshell as the expansion for the join to take effect.
TASK_STR=$(IFS=,; echo "${SELECTED_TASKS[*]}")
echo "TASK_STR: $TASK_STR"

START_TIME=$(date +%s)
echo...
```
Replacing `+=` with `text_embeds = text_embeds + (...)` avoids the "leaf Variable that requires grad is being used in an in-place operation" RuntimeError in PyTorch.
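A minimal reproduction of the error and the out-of-place fix (the tensor shapes are illustrative only):

```python
import torch

# A leaf tensor that requires grad, e.g. trainable text embeddings.
text_embeds = torch.zeros(4, 8, requires_grad=True)
delta = torch.randn(4, 8)

# An in-place update on a leaf raises:
# RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
# text_embeds += delta

# The out-of-place add builds a new (non-leaf) tensor, so autograd can
# still track the original leaf through the addition.
text_embeds = text_embeds + delta
```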
Hello, I have a question about understanding the code. In the eval_forward function, I noticed that the code concatenates answer_embeds with input_embeds and then feeds the combined...
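For context, this concatenation pattern is typical of teacher-forced scoring: the prompt and answer embeddings are fed through the model in one pass, and loss is computed only over the answer positions. The sketch below is an assumption about the general technique, not VILA's eval_forward; every name in it is hypothetical.

```python
import torch
import torch.nn.functional as F

def eval_forward_sketch(model, input_embeds, answer_embeds, answer_ids):
    """Teacher-forced scoring sketch: run prompt + answer embeddings in a
    single forward pass and score only the answer tokens."""
    combined = torch.cat([input_embeds, answer_embeds], dim=1)  # (B, T_in + T_ans, D)
    logits = model(inputs_embeds=combined).logits
    # Logits at position t predict token t + 1, so the window predicting
    # the answer starts one step before the first answer position.
    start = input_embeds.size(1) - 1
    answer_logits = logits[:, start : start + answer_ids.size(1), :]
    return F.cross_entropy(
        answer_logits.reshape(-1, answer_logits.size(-1)),
        answer_ids.reshape(-1),
    )
```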
When using LongViLa-R1 for video summarization, I encountered an issue where one video chunk took an abnormally long time to process, resulting in a large summary with significant repetition. Model:...
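A common mitigation for runaway, repetitive generations is to cap the output length and penalize repetition. The settings below follow Hugging Face's GenerationConfig; whether LongViLa-R1's inference path honors all of them is an assumption.

```python
from transformers import GenerationConfig

# Hypothetical mitigation, not a confirmed fix for this issue.
gen_config = GenerationConfig(
    max_new_tokens=512,       # hard cap so one chunk cannot run indefinitely
    repetition_penalty=1.2,   # discourage re-emitting recent tokens
    no_repeat_ngram_size=4,   # forbid verbatim 4-gram repeats
    do_sample=False,
)
```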
Hello VILA team! First, thank you for open-sourcing this incredible family of Vision Language Models! The work on VILA and NVILA is truly impressive, and the focus on efficiency and...