
LAVIS - A One-stop Library for Language-Vision Intelligence

282 LAVIS issues

The "BLIP-2 ViT-G Flan-T5-XXL" has 12.1B parameters. Therefore, how do you load this model to cuda memory? Do you use some skills like model sharding or the framework like Deepspeed?

I want to use some of the models in a Kaggle code competition, so I'll upload them to Kaggle and load from a path in an offline environment. I'm not...
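A sketch of the usual offline-loading pattern, assuming the checkpoint directory has already been downloaded and attached as a Kaggle dataset (the path below is hypothetical):

```python
import torch
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Hypothetical path to a pre-downloaded checkpoint attached as a Kaggle dataset.
LOCAL_DIR = "/kaggle/input/blip2-flan-t5-xl"

# local_files_only=True prevents any attempt to reach the Hugging Face Hub,
# which would fail in Kaggle's no-internet execution environment.
processor = Blip2Processor.from_pretrained(LOCAL_DIR, local_files_only=True)
model = Blip2ForConditionalGeneration.from_pretrained(
    LOCAL_DIR, torch_dtype=torch.float16, local_files_only=True
)
```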

If we run training resuming from epoch N and max epoch is
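The snippet below is not LAVIS's own code, just a generic sketch of how epoch-based resuming is commonly handled, to make the question concrete: training restarts at the epoch stored in the checkpoint plus one and runs up to (but not including) `max_epoch`. `train_one_epoch` is a placeholder.

```python
import torch

def train_one_epoch(model, optimizer, epoch):
    """Placeholder for the actual per-epoch training loop."""
    pass

def resume_training(model, optimizer, ckpt_path, max_epoch):
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(checkpoint["model"])
    optimizer.load_state_dict(checkpoint["optimizer"])
    start_epoch = checkpoint["epoch"] + 1  # resume after the last finished epoch

    # Runs epochs start_epoch, ..., max_epoch - 1: max_epoch is the total
    # number of epochs, not the number of additional epochs after resuming.
    for epoch in range(start_epoch, max_epoch):
        train_one_epoch(model, optimizer, epoch)
        torch.save(
            {"model": model.state_dict(),
             "optimizer": optimizer.state_dict(),
             "epoch": epoch},
            ckpt_path,
        )
```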

When I try to run run_scripts/blip2/train/train_caption_coco.sh, it fails with an error:

```
File "/home/jovyan/environment/anaconda3/body/envs/lavis/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/export'
```

I did not find...
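A likely cause (an assumption, check the configs shipped with your install) is that the cache root or dataset storage path configured in LAVIS points somewhere under `/export`, which is not writable on this machine. A small sketch for confirming that before launching training; adjust `CACHE_ROOT` to whatever your config actually uses:

```python
import os

# Assumption: the failing path comes from the cache_root / dataset storage
# configured in your installed LAVIS configs (often something under /export).
CACHE_ROOT = "/export/home/.cache/lavis"  # adjust to your config's value

# Walk up to the deepest ancestor that already exists and check writability
# there; that is the directory os.makedirs would actually try to write into.
ancestor = CACHE_ROOT
while not os.path.exists(ancestor):
    ancestor = os.path.dirname(ancestor)

if os.access(ancestor, os.W_OK):
    os.makedirs(CACHE_ROOT, exist_ok=True)
    print(f"{CACHE_ROOT} is usable.")
else:
    print(f"No write permission under {ancestor}; "
          "point the cache root at a writable directory instead.")
```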

```
2023-03-06 00:57:00,090 [INFO] Start training epoch 0, 8855 iters per inner epoch.
Train: data epoch: [0]  [   0/8855]  eta: 17:04:42  lr: 0.000000  loss: 0.7002  time: 6.9432  data: 0.0000  max...
```

AssertionError: BLIP models are not compatible with transformers>=4.27, run `pip install transformers==4.25` to downgrade.
GPU: RTX A6000, CUDA version: 11.7, conda version: 23.3.1
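For context, a hedged sketch of the kind of guard this assertion implements: checking the installed `transformers` version before loading the BLIP (v1) models, so the incompatibility surfaces as an explicit message rather than a cryptic failure later.

```python
from packaging import version
import transformers

# The LAVIS BLIP (v1) models expect transformers < 4.27; newer releases
# changed internals these models rely on, hence the assertion.
if version.parse(transformers.__version__) >= version.parse("4.27"):
    raise RuntimeError(
        f"transformers {transformers.__version__} detected; "
        "BLIP models in LAVIS need transformers==4.25 "
        "(pip install transformers==4.25)."
    )
```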

I just want to know whether LAVIS supports fine-tuning the CLIP model for retrieval tasks, such as on the COCO or Flickr30k datasets? I see that there is nothing about...

Thank you for your outstanding work. 1. I noticed that the paper mentions "For ScienceQA, we only evaluate the set with image context." Does this mean that the hint or...

Hello! I'm trying to run Vicuna InstructBLIP, but sadly I can't make it work. I installed LAVIS directly from your repo, following step 3 of the [installation](https://github.com/salesforce/LAVIS#installation) guide, and...
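For reference, a minimal loading sketch along the lines of the InstructBLIP instructions in the repo; it assumes the Vicuna-7B LLM weights have already been obtained and their path set in the model's yaml config, which is a separate manual step:

```python
import torch
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumes the Vicuna-7B weights are prepared and referenced by the
# blip2_vicuna_instruct config; without that step this call will fail.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_vicuna_instruct",
    model_type="vicuna7b",
    is_eval=True,
    device=device,
)
```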

In the BLIP-2 paper, it is specified that: "[Q-Former] _extracts a fixed number of output features from the image encoder, independent of input image resolution._". However, when using cross-attention, this...
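To make the quoted claim concrete, here is a small generic cross-attention sketch (not the Q-Former implementation itself): a fixed set of learned query tokens attends over however many image-encoder features there are, and the output always has one vector per query, independent of input resolution.

```python
import torch
import torch.nn as nn

class FixedQueryCrossAttention(nn.Module):
    """Toy illustration: 32 learned queries cross-attend to image features."""

    def __init__(self, dim=768, num_queries=32, num_heads=12):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(1, num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, image_feats):  # image_feats: [B, N_patches, dim]
        q = self.queries.expand(image_feats.size(0), -1, -1)
        out, _ = self.attn(q, image_feats, image_feats)
        return out  # [B, 32, dim], regardless of N_patches

m = FixedQueryCrossAttention()
for n_patches in (257, 577):  # e.g. features from two different resolutions
    feats = torch.randn(2, n_patches, 768)
    print(m(feats).shape)  # torch.Size([2, 32, 768]) both times
```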