Orr Zohar issues

Results: 26 issues of Orr Zohar

[BUG] zero3 hang during inference, need to detach part of computational graph, .detach()/torch.no_grad do not work.

**Describe the bug** I am training a video-LLM model, where I encode long videos with a varying number of forward passes to avoid OOM issues. I would like to use...

bug

training

[REQUEST] ZeRO3 doc - support for wrapping model sub-components separately for training

**Is your feature request related to a problem? Please describe.** It is very difficult to train MM models (e.g., multi-image chat/video chat) in zero3 because the effective ``vision batch``>>``text...

enhancement

internvideo2 distilled models config

Hi, for the InternVideo2-S/B/L encoders, what value was used for `sep_image_video_pos_embed`? It seems this was set to true in the 1b/6b models but false in S/B/L. I am trying to...

Sequence Parallel logic -- why are you padding with '#'?

Hi, when you do Sequence Parallel, you are padding with token id 2 = '#' https://github.com/NVlabs/VILA/blob/2b43308f25e63161a172fe9a38e3a04e2fcd12ef/llava/data/dataset.py#L1372-L1389 Could you let me know why you are padding with this instead of...

raw evaluations for proprietary models

Hi, will you open-source, or can you share, the raw evaluations for proprietary models? Best, Orr

raw evaluations of proprietary models

Hi, will you open-source, or can you share, the raw evaluations for proprietary models? Best, Orr