
LAVIS - A One-stop Library for Language-Vision Intelligence

282 LAVIS issues

I think that stage 1 training, i.e. the vision-language representation learning with the three objectives mentioned in the paper, is not yet implemented. Am I right? Not implemented `load_pretrained: False`...

I tried to train with the clip_L vision encoder (by adding vit_model: "clip_L" to the model train config), but it seems the QFormer checkpoint loaded by default at this line (https://github.com/salesforce/LAVIS/blob/main/lavis/models/base_model.py#L100) is...

I have to use transformers 4.27 because the latest version of clip-interrogator requires that specific version. After upgrading transformers from 4.26 to 4.27, I ran into this issue. ``` ╭─────────────────────────────── Traceback (most...

Hi, thank you for your great work on BLIP2. I found there is no zero-shot VQA evaluation code for BLIP2-OPT, so I created one, referring to the code for FLAN-T5. However,...

…mismatch in later XMLBert decoder. Fix issue #241

cla:signed

Can BLIP2 work with a small batch size like 64, 128, or 256?

Is it possible to do this without going from GPU -> CPU -> GPU? Code: ```py vis_images = [self.vis_processors["eval"](Image.fromarray(image.cpu().numpy())).unsqueeze(0).to('cuda').squeeze(0) for image in images] features_image = self.blip2_model.extract_features({"image": torch.stack(vis_images)}, mode="image") ``` extract_features...
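
If the frames are already CUDA tensors, one way to avoid the PIL/CPU roundtrip is to reproduce the eval transform directly on the GPU. The sketch below is an assumption, not LAVIS code: it presumes the configured `vis_processors["eval"]` is the usual 224×224 resize plus CLIP-style normalization, so the image size and mean/std constants should be checked against the actual processor.

```python
import torch
import torch.nn.functional as F

# Assumed preprocessing parameters; verify against vis_processors["eval"].
IMAGE_SIZE = 224
CLIP_MEAN = torch.tensor([0.48145466, 0.4578275, 0.40821073])
CLIP_STD = torch.tensor([0.26862954, 0.26130258, 0.27577711])

def preprocess_on_gpu(images: torch.Tensor) -> torch.Tensor:
    """Resize and normalize a (B, 3, H, W) uint8 CUDA batch without leaving the GPU."""
    # If the batch is (B, H, W, 3), permute to channels-first first:
    # images = images.permute(0, 3, 1, 2)
    x = images.float() / 255.0  # scale to [0, 1]
    x = F.interpolate(x, size=(IMAGE_SIZE, IMAGE_SIZE), mode="bicubic", align_corners=False)
    mean = CLIP_MEAN.to(x.device).view(1, 3, 1, 1)
    std = CLIP_STD.to(x.device).view(1, 3, 1, 1)
    return (x - mean) / std

# Hypothetical usage mirroring the snippet above:
# features_image = blip2_model.extract_features({"image": preprocess_on_gpu(images)}, mode="image")
```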

When I set use_dist_eval_sampler=True in retrieval evaluation, the results are very bad, but when I set use_dist_eval_sampler=False, the results are very good.
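
A likely explanation is that `use_dist_eval_sampler=True` shards the evaluation set across ranks with a `DistributedSampler`, which also pads the dataset with duplicates so every rank gets the same number of samples; if the per-rank outputs are not gathered and de-duplicated before computing retrieval metrics, scores from a single shard look much worse. A small illustrative sketch of that sharding behaviour (plain PyTorch, not LAVIS code):

```python
from torch.utils.data import DistributedSampler

dataset = list(range(10))  # toy eval set with 10 samples

# shuffle=False matches evaluation; num_replicas/rank are passed explicitly
# so no process group is needed for this demonstration.
for rank in range(4):
    sampler = DistributedSampler(dataset, num_replicas=4, rank=rank, shuffle=False)
    print(rank, list(sampler))

# Each rank receives ceil(10 / 4) = 3 indices, so two samples appear twice
# across ranks as padding; retrieval metrics should be computed only after
# gathering and de-duplicating results from all ranks.
```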

I trained a fine-tuned model with the command `python train.py --cfg-path lavis/projects/blip2/train/pretrain_stage2.yaml`. My env is ![image](https://user-images.githubusercontent.com/4124006/229704320-75e5e6ef-9c30-4a23-8261-ece7a2fc7638.png), but when I use the fine-tuned model to generate captions, this error happens: `RuntimeError: Sizes of...

I tried to run the code from Hugging Face (https://huggingface.co/Salesforce/blip2-opt-2.7b): ```python import requests from PIL import Image from transformers import BlipProcessor, Blip2ForConditionalGeneration processor = BlipProcessor.from_pretrained("Salesforce/blip2-opt-2.7b") ``` However, I get this...
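
For reference, the pattern on that model card pairs `Blip2Processor` (not `BlipProcessor`) with `Blip2ForConditionalGeneration`; a minimal sketch assuming transformers >= 4.27 and a CUDA device:

```python
import requests
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Blip2Processor is the processor class that matches the BLIP-2 checkpoints.
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16
).to("cuda")

# Example image (any RGB image works here).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

inputs = processor(images=image, return_tensors="pt").to("cuda", torch.float16)
generated_ids = model.generate(**inputs)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())
```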