VILA
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
LongViLa-LLama3-1024Frames output is often repetitive. Why does this happen, and are there any suggestions to reduce the repetition?
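A common way to reduce repetitive generations is to adjust the decoding settings, since long video inputs under greedy decoding tend to amplify repetition. The sketch below uses Hugging Face `GenerationConfig` field names as a reference point; whether the LongVILA inference path exposes these exact knobs is an assumption, so map them onto whatever generation arguments the script actually accepts.

```python
from transformers import GenerationConfig

# Decoding settings that commonly curb repetition (an assumption-level sketch,
# not LongVILA's defaults); pass the equivalents to whatever generate() call
# the inference script uses.
gen_cfg = GenerationConfig(
    do_sample=True,          # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,  # down-weight tokens that already appeared
    no_repeat_ngram_size=3,  # forbid repeating any 3-gram verbatim
    max_new_tokens=512,
)
print(gen_cfg)
```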
How can I modify this to be used for video querying and description?

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000",
    api_key="fake-key",
)
response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            ...
```
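For video querying specifically, a minimal sketch is shown below. It assumes the local VILA server accepts a `video_url` content entry analogous to OpenAI's `image_url` type; check `serving/server.py` or `query_nvila.py` in the repo for the exact schema it parses. The model name and media URL are placeholders.

```python
from openai import OpenAI

# Minimal sketch, not the official client: it assumes the local VILA server
# accepts a "video_url" content entry analogous to OpenAI's "image_url" type.
# Check serving/server.py (or query_nvila.py) for the exact schema it parses;
# the model name and URL below are placeholders.
client = OpenAI(base_url="http://localhost:8000", api_key="fake-key")

response = client.chat.completions.create(
    model="NVILA-8B-Video",  # placeholder; use whatever the server was launched with
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please describe the video."},
                {
                    "type": "video_url",  # assumption: server-specific content type
                    "video_url": {"url": "https://example.com/clip.mp4"},
                },
            ],
        }
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```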
I ran this script:

```bash
vila-infer \
  --model-path /NVILA-8B-Video \
  --conv-mode auto \
  --text "Please describe the video" \
  --media https://huggingface.co/datasets/Efficient-Large-Model/VILA-inference-demos/resolve/main/OAI-sora-tokyo-walk.mp4
```

but got this error message:

```
input = media_embeds[name].popleft()
IndexError: pop...
```
Hi, I am using NVILA-Lite-8B-stage2 to fine-tune on my downstream task. The input has at most 8 images and at least 3. But I found that 7×A100 with ZeRO-2 can't...
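When ZeRO-2 runs out of memory on samples with many images, a common fallback is ZeRO-3 with CPU offload. The config below is a generic DeepSpeed sketch, not the repo's own zero2/zero3 JSON; verify the keys against the DeepSpeed documentation and point the training launcher's DeepSpeed argument at the generated file.

```python
import json

# Generic DeepSpeed ZeRO-3 config with CPU offload, written out as JSON.
# This is an assumption-level sketch, not VILA's shipped configuration; pass
# the resulting file via the launcher's DeepSpeed config argument.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

with open("zero3_offload.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```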
I am very confused about the models here. https://huggingface.co/collections/Efficient-Large-Model/nvila-674f8163543890b35a91b428
Hello, I am running `serving/server.py` with NVILA-Lite-8B and using the OpenAI API to retrieve chat completions, as done in [query_nvila.py](https://github.com/NVlabs/VILA/blob/main/serving/query_nvila.py). Now I want to enforce structured output, but I get: `Error...
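For reference, the standard OpenAI-style way to request structured output is sketched below; whether `serving/server.py` implements `response_format` at all is exactly what seems to be failing here, so treat that parameter as an assumption about the server. The base URL and model name are placeholders.

```python
from openai import OpenAI

# Sketch of the standard OpenAI-style structured-output request, for reference.
# Whether the VILA serving/server.py endpoint supports response_format is an
# assumption (the error above suggests it may not); base_url is a placeholder.
client = OpenAI(base_url="http://localhost:8000", api_key="fake-key")

response = client.chat.completions.create(
    model="NVILA-Lite-8B",  # placeholder name
    messages=[
        {"role": "system", "content": "Reply only with a JSON object."},
        {"role": "user", "content": 'Describe the image as {"caption": ..., "objects": [...]}.'},
    ],
    response_format={"type": "json_object"},  # OpenAI JSON mode; server support is the question
)
print(response.choices[0].message.content)
```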
Hi there, I am trying to fine-tune the VILA1.5-3b model with a custom labeled dataset. I am using a well-resourced cluster with 2 A100 GPUs and 100 GB of RAM on...
Hello everyone, thanks for this amazing work! I'm trying to run inference with the NVILA-8B model on an NVIDIA V100 GPU but am facing an issue. I understand from the model requirements that NVILA...
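One likely culprit on V100 is the data type: bfloat16 needs compute capability 8.0 (Ampere) or newer, while V100 is 7.0, so a bf16-default checkpoint would have to be run in fp16 or fp32 there. A quick check, assuming PyTorch as the backend (whether VILA's scripts expose a dtype switch is an assumption):

```python
import torch

# Report the GPU's compute capability and whether bfloat16 is supported.
# V100 (sm_70) predates hardware bf16, which arrived with Ampere (sm_80).
major, minor = torch.cuda.get_device_capability(0)
print(f"compute capability: {major}.{minor}")
print("bf16 supported:", torch.cuda.is_bf16_supported())

# Fall back to fp16 where bf16 is unavailable (an assumption-level workaround;
# numerics may differ from the bf16 the model was trained and evaluated with).
dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
print("suggested dtype:", dtype)
```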
Hello everyone, thanks for sharing this work. I am trying to benchmark it using a different dataset/task. For now, I am more concerned about the latency numbers. I am...
I am trying to start the NVILA server for the 15B model, but it has lots of bugs, and the latest one is that it is not able to take text and an image together. I see...
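For what it's worth, the OpenAI-style way to combine text and an image in a single request is sketched below, this time with a local file sent as a base64 data URL. It assumes the server parses `image_url` content entries the way `query_nvila.py` sends them; the model name and file path are placeholders.

```python
import base64
from openai import OpenAI

# Combined text + image request in the OpenAI vision-message format.
# Assumes serving/server.py accepts "image_url" content entries; the model
# name and image path below are placeholders.
client = OpenAI(base_url="http://localhost:8000", api_key="fake-key")

with open("demo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="NVILA-15B",  # placeholder; use the model the server was launched with
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```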