
LAVIS - A One-stop Library for Language-Vision Intelligence

282 LAVIS issues

The "BLIP-2 ViT-G Flan-T5-XXL" has 12.1B parameters. Therefore, how do you load this model to cuda memory? Do you use some skills like model sharding or the framework like Deepspeed?

I want to use some of the models in a Kaggle code competition, so I'll upload them to Kaggle and load from a path in an offline environment. I'm not...
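A sketch of the usual offline-loading pattern, assuming the checkpoint directory has already been downloaded and attached as a Kaggle dataset (the path below is hypothetical):

```python
import torch
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Hypothetical path to a pre-downloaded checkpoint attached as a Kaggle dataset.
LOCAL_DIR = "/kaggle/input/blip2-flan-t5-xl"

# local_files_only=True prevents any attempt to reach the Hugging Face Hub,
# which would fail in Kaggle's no-internet execution environment.
processor = Blip2Processor.from_pretrained(LOCAL_DIR, local_files_only=True)
model = Blip2ForConditionalGeneration.from_pretrained(
    LOCAL_DIR, torch_dtype=torch.float16, local_files_only=True
)
```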

If we run training resuming from epoch N and max epoch is
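The snippet below is not LAVIS's own code, just a generic sketch of how epoch-based resuming is commonly handled, to make the question concrete: training restarts at the epoch stored in the checkpoint plus one and runs up to (but not including) `max_epoch`. `train_one_epoch` is a placeholder.

```python
import torch

def train_one_epoch(model, optimizer, epoch):
    """Placeholder for the actual per-epoch training loop."""
    pass

def resume_training(model, optimizer, ckpt_path, max_epoch):
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(checkpoint["model"])
    optimizer.load_state_dict(checkpoint["optimizer"])
    start_epoch = checkpoint["epoch"] + 1  # resume after the last finished epoch

    # Runs epochs start_epoch, ..., max_epoch - 1: max_epoch is the total
    # number of epochs, not the number of additional epochs after resuming.
    for epoch in range(start_epoch, max_epoch):
        train_one_epoch(model, optimizer, epoch)
        torch.save(
            {"model": model.state_dict(),
             "optimizer": optimizer.state_dict(),
             "epoch": epoch},
            ckpt_path,
        )
```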

When I try to run run_scripts/blip2/train/train_caption_coco.sh, it fails with an error:

```
File "/home/jovyan/environment/anaconda3/body/envs/lavis/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/export'
```

I did not find...
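A likely cause (an assumption, check the configs shipped with your install) is that the cache root or dataset storage path configured in LAVIS points somewhere under `/export`, which is not writable on this machine. A small sketch for confirming that before launching training; adjust `CACHE_ROOT` to whatever your config actually uses:

```python
import os

# Assumption: the failing path comes from the cache_root / dataset storage
# configured in your installed LAVIS configs (often something under /export).
CACHE_ROOT = "/export/home/.cache/lavis"  # adjust to your config's value

# Walk up to the deepest ancestor that already exists and check writability
# there; that is the directory os.makedirs would actually try to write into.
ancestor = CACHE_ROOT
while not os.path.exists(ancestor):
    ancestor = os.path.dirname(ancestor)

if os.access(ancestor, os.W_OK):
    os.makedirs(CACHE_ROOT, exist_ok=True)
    print(f"{CACHE_ROOT} is usable.")
else:
    print(f"No write permission under {ancestor}; "
          "point the cache root at a writable directory instead.")
```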

```
2023-03-06 00:57:00,090 [INFO] Start training epoch 0, 8855 iters per inner epoch.
Train: data epoch: [0]  [   0/8855]  eta: 17:04:42  lr: 0.000000  loss: 0.7002  time: 6.9432  data: 0.0000  max...
```

AssertionError: BLIP models are not compatible with transformers>=4.27, run `pip install transformers==4.25` to downgrade.
GPU: RTX A6000, CUDA version: 11.7, conda version: 23.3.1
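For context, a hedged sketch of the kind of guard this assertion implements: checking the installed `transformers` version before loading the BLIP (v1) models, so the incompatibility surfaces as an explicit message rather than a cryptic failure later.

```python
from packaging import version
import transformers

# The LAVIS BLIP (v1) models expect transformers < 4.27; newer releases
# changed internals these models rely on, hence the assertion.
if version.parse(transformers.__version__) >= version.parse("4.27"):
    raise RuntimeError(
        f"transformers {transformers.__version__} detected; "
        "BLIP models in LAVIS need transformers==4.25 "
        "(pip install transformers==4.25)."
    )
```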

I just want to know whether LAVIS supports fine-tuning the CLIP model for retrieval tasks, such as on the COCO or Flickr30k datasets? I see that there is nothing about...

Thank you for your outstanding work. 1. I noticed that the paper mentions "For ScienceQA, we only evaluate the set with image context." Does this mean that the hint or...

Hello! I'm trying to run Vicuna InstructBLIP, but sadly I can't make it work. I installed LAVIS directly from your repo, following step 3 of the [installation](https://github.com/salesforce/LAVIS#installation) guide, and...
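For reference, a minimal loading sketch along the lines of the InstructBLIP instructions in the repo; it assumes the Vicuna-7B LLM weights have already been obtained and their path set in the model's yaml config, which is a separate manual step:

```python
import torch
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumes the Vicuna-7B weights are prepared and referenced by the
# blip2_vicuna_instruct config; without that step this call will fail.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_vicuna_instruct",
    model_type="vicuna7b",
    is_eval=True,
    device=device,
)
```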

In the BLIP-2 paper, it is specified that: "[Q-Former] _extracts a fixed number of output features from the image encoder, independent of input image resolution._". However, when using cross-attention, this...
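To make the quoted claim concrete, here is a small generic cross-attention sketch (not the Q-Former implementation itself): a fixed set of learned query tokens attends over however many image-encoder features there are, and the output always has one vector per query, independent of input resolution.

```python
import torch
import torch.nn as nn

class FixedQueryCrossAttention(nn.Module):
    """Toy illustration: 32 learned queries cross-attend to image features."""

    def __init__(self, dim=768, num_queries=32, num_heads=12):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(1, num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, image_feats):  # image_feats: [B, N_patches, dim]
        q = self.queries.expand(image_feats.size(0), -1, -1)
        out, _ = self.attn(q, image_feats, image_feats)
        return out  # [B, 32, dim], regardless of N_patches

m = FixedQueryCrossAttention()
for n_patches in (257, 577):  # e.g. features from two different resolutions
    feats = torch.randn(2, n_patches, 768)
    print(m(feats).shape)  # torch.Size([2, 32, 768]) both times
```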