
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

141 VILA issues

Hi, thank you for your outstanding work! Without a doubt, your recently published VILA v1.5 series pushes the boundaries of multimodal large language models. It is arguably the most powerful...

I noticed a bug in the data sampler. In the original implementation, the same elements are dropped in every epoch. For example, assume the dataset size is 900, and...
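The behavior described matches a sampler that shuffles with a fixed seed, so the tail dropped to make the length divisible by the batch size is identical in every epoch. Below is a minimal sketch of the epoch-seeded pattern that avoids this; the class and parameter names are hypothetical, not VILA's actual sampler:

```python
import torch
from torch.utils.data import Sampler

class EpochAwareSampler(Sampler):
    """Shuffles with a per-epoch seed so the elements dropped to make
    the length divisible by batch_size differ across epochs."""

    def __init__(self, dataset_size: int, batch_size: int, seed: int = 0):
        self.dataset_size = dataset_size
        self.batch_size = batch_size
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch: int) -> None:
        # The training loop calls this at the start of each epoch (as it
        # does for torch's DistributedSampler). Without it, the same
        # permutation -- and hence the same dropped tail -- repeats.
        self.epoch = epoch

    def __iter__(self):
        g = torch.Generator()
        g.manual_seed(self.seed + self.epoch)  # epoch-dependent shuffle
        perm = torch.randperm(self.dataset_size, generator=g).tolist()
        usable = (self.dataset_size // self.batch_size) * self.batch_size
        return iter(perm[:usable])  # drop a different tail each epoch

    def __len__(self) -> int:
        return (self.dataset_size // self.batch_size) * self.batch_size
```

With a fixed seed and no `set_epoch` call, `perm` is identical every epoch, so in the reporter's example (dataset size 900) the same leftover samples would never be seen during training.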

I'm trying to get fine-tuning working through the 3_sft.sh script but am encountering an error:

```
Traceback (most recent call last):
  File "/root/VILA/llava/train/train_mem.py", line 36, in <module>
    train()
  File "/root/VILA/llava/train/train.py", line ...
```

Please provide a script to run the VILA1.5-40b int4 quantized model, like this:

Hello, thank you for the amazing work you've done on this project. I'm particularly interested in the upcoming VILA2 model and its associated code. Could you please share any information...

There are multiple mentions of a multimodal sequence-parallel system for inference that can be seamlessly integrated with HF Transformers. However, I am not able to follow this through...

Hello, I am trying to run the VILA model for inference, but I have encountered a couple of issues that I need help with. (1) FlashAttention issue: Initially, I faced a...
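The FlashAttention details of this report are cut off, but a common workaround when flash-attn is missing or incompatible with the GPU is to fall back to another attention backend at load time. Here is a minimal sketch using the standard Hugging Face transformers `attn_implementation` argument; the model path is a placeholder, and this is the generic transformers mechanism, not VILA's own loading code:

```python
import torch
from transformers import AutoModelForCausalLM

def load_with_fallback(model_path: str):
    # Try FlashAttention 2 first; fall back to PyTorch's SDPA backend
    # if the flash-attn package is missing or unsupported on this GPU.
    try:
        return AutoModelForCausalLM.from_pretrained(
            model_path,
            torch_dtype=torch.float16,
            attn_implementation="flash_attention_2",
        )
    except (ImportError, ValueError):
        return AutoModelForCausalLM.from_pretrained(
            model_path,
            torch_dtype=torch.float16,
            attn_implementation="sdpa",
        )
```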

Great work and research. My question is simply whether it is possible to use only the visual/video part (already pretrained on a video dataset like Kinetics) for fine-tuning on long video...

Hello, I'm new to LLM serving and multimodal LLMs. I'm looking for a similar example for the LongVILA model, like this one for the VILA1.5 models:

```
python -W ignore llava/eval/run_vila.py --model-path...
```

I found that datasets like Efficient-Large-Model/sherlock_317K can no longer be downloaded; I get a 404 when I look them up on Hugging Face Datasets.