VILA icon indicating copy to clipboard operation
VILA copied to clipboard

VILA as server not working

Open Eyshika opened this issue 7 months ago • 6 comments

I am trying to start NVILA server for 15B, but it has lots of bugs and the latest one is not able to take text and image together. I see this error on Client side

openai.InternalServerError: Error code: 500 - {'error': "'NoneType' object is not iterable"}

and logs on server side with added debug logs

[DEBUG] images_tensor type: <class 'torch.Tensor'>
[DEBUG] model expects vision tower: n/a
[DEBUG] model.generate kwargs: images=<class 'list'>, text len=187
INFO:     107.217.97.155:58591 - "POST /chat/completions HTTP/1.1" 500 Internal Server Error

The problem seems to come from model.generate which only takes image

Eyshika avatar Apr 25 '25 16:04 Eyshika

how you launch server?

Lyken17 avatar Apr 27 '25 01:04 Lyken17

@Lyken17 Am using the command provided in README

python -W ignore server.py \
    --port 8000 \
    --model-path Efficient-Large-Model/NVILA-15B \
    --conv-mode auto

Initially it gave error for auto then I switched to vicuna_1 now it gives error in taking image and text together and throws error

 Error code: 500 - {'error': "'NoneType' object is not subscriptable"}

Eyshika avatar Apr 28 '25 13:04 Eyshika

I have a PR with fixes @Lyken17 https://github.com/NVlabs/VILA/pull/235

Eyshika avatar Apr 30 '25 06:04 Eyshika

awesome fixing! just quickly go through the fix. Why we still need vicuna_v1? The scripts are supposed to run with NVILA series of checkpoints only.

Lyken17 avatar Apr 30 '25 15:04 Lyken17

vicuna_v1 is what I tried NVILA-15B with, since auto has bugs. I don't know whats the best one for NVILA-15B but that's the one seemed most accurate to be used as user assistant. Also we can use any its just set as default when nothing is there.

Eyshika avatar May 01 '25 04:05 Eyshika

@Lyken17 Isnt vicunna_v1 a way to design conversation for VILA? if not, then how is it supposed to be taking checkpoints? auto doesn't work and that also uses vicunna as default. All the eval code is also using vicunna. So am confused by your statement

Eyshika avatar May 01 '25 04:05 Eyshika