
LAVIS - A One-stop Library for Language-Vision Intelligence

282 LAVIS issues

Hello, I am using the blip2_t5 model (model_type="pretrain_flant5xxl") to predict answers for a given input. I provide a list of answer candidates to the model, but the model still predicts...
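
(For context, a minimal sketch of the call pattern described above; `predict_answers` and its `answer_list` handling for `blip2_t5` are assumptions to check against the installed version, not a confirmed API.)

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load BLIP-2 with the FlanT5-XXL language model.
model, vis_processors, txt_processors = load_model_and_preprocess(
    name="blip2_t5", model_type="pretrain_flant5xxl", is_eval=True, device=device
)

raw_image = Image.open("example.jpg").convert("RGB")  # placeholder image path
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
question = txt_processors["eval"]("What color is the button?")

# Assumption: candidate answers are passed via `answer_list`; the report above
# suggests the model may still generate text outside this list.
answers = model.predict_answers(
    samples={"image": image, "text_input": question},
    answer_list=["red", "blue", "yellow"],
    inference_method="generate",
)
print(answers)
```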

Our server is running Ubuntu 22.04 with a 32-core vCPU and 64 GiB of RAM. We have a base coco model, and it was OK, but not very accurate. We...

![image](https://user-images.githubusercontent.com/43492238/229017598-a20120fc-4592-4cc3-a301-c61080d5504d.png) In the training of an encoder-decoder-based LLM (e.g. FlanT5), Prefix Text (i.e. sample['text_input'] in the code) and Suffix Text (i.e. sample['text_output'] in the code) are needed. However, all the datasets in this repo don't have...
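
(Illustrative only: a sketch of what a sample with both fields could look like for encoder-decoder training, assuming `text_input` is the encoder prefix and `text_output` the decoder target; this is not taken from the repo's datasets.)

```python
import torch

# Hypothetical sample layout for an encoder-decoder LLM such as FlanT5:
# "text_input" is the prefix fed to the encoder, "text_output" is the
# suffix used as the decoder target. The question above is that most
# datasets in the repo only provide a single caption field.
sample = {
    "image": torch.randn(3, 224, 224),          # placeholder preprocessed image
    "text_input": "a photo of",                  # prefix text
    "text_output": "a dog playing in a park",    # suffix / target text
}
```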

It looks like the blip_caption / base_coco models were updated and are no longer compatible with the code? Or is it a bug on my side? I'm trying a cog container (that...
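
(One way to sanity-check that checkpoint against the current code is to load it through the model zoo; a minimal sketch, with the image path as a placeholder.)

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the BLIP captioning model finetuned on COCO from the LAVIS model zoo.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip_caption", model_type="base_coco", is_eval=True, device=device
)

raw_image = Image.open("example.jpg").convert("RGB")
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
print(model.generate({"image": image}))
```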

Firstly, thank you for the amazing VQA model! Is the color space of BLIP-2's image BGR or RGB? When doing VQA, it identifies blue buttons as red buttons, and yellow...
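
(The LAVIS image processors work on PIL images, which are RGB; a common cause of swapped colors is passing OpenCV frames, which are BGR, without converting first. A small sketch, assuming the frame comes from cv2:)

```python
import cv2
from PIL import Image

# OpenCV loads images in BGR channel order; convert to RGB before handing
# the frame to the LAVIS vis_processors, which expect RGB PIL images.
bgr = cv2.imread("example.jpg")
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
raw_image = Image.fromarray(rgb)
```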

https://github.com/salesforce/LAVIS/blob/47e0f3f25ca763975738c7224c8369207812ce6c/lavis/models/blip2_models/blip2_opt.py#L24-L25 The model names should be pretrain_opt2.7b and pretrain_opt6.7b
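
(For reference, those keys are what gets passed as `model_type` when loading the OPT variants, roughly like this:)

```python
import torch
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"

# The model_type string must match a key registered in blip2_opt.py,
# e.g. "pretrain_opt2.7b" or "pretrain_opt6.7b".
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt", model_type="pretrain_opt2.7b", is_eval=True, device=device
)
```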

Dear Authors, could you please make the VQA finetuning codebase available to us?

model.generate() can take a batch of images and one prompt; can we pass a batch of images and a batch of prompts? Thanks. EDIT: I want every image to interact with every prompt, so...
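
(A sketch of one way to get the every-image-with-every-prompt behaviour by flattening the cross product into a single batch; whether `generate` accepts a list of prompts matching the image batch size is an assumption to verify, and `blip2_opt` is used here only as an example.)

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt", model_type="pretrain_opt2.7b", is_eval=True, device=device
)

raw_images = [Image.open(p).convert("RGB") for p in ["a.jpg", "b.jpg"]]
prompts = ["Question: what is shown? Answer:", "Question: what color is it? Answer:"]

# Build the cross product: repeat each image once per prompt and tile the
# prompt list across images, so every image is paired with every prompt.
images = torch.stack([vis_processors["eval"](im) for im in raw_images]).to(device)
images = images.repeat_interleave(len(prompts), dim=0)   # (N*M, 3, H, W)
flat_prompts = prompts * len(raw_images)                 # length N*M

# Assumption: `prompt` may be a list with one entry per image in the batch.
outputs = model.generate({"image": images, "prompt": flat_prompts})
print(outputs)
```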

Great work and thanks for releasing the code and checkpoints for BLIP2. One thing we noticed missing is the training and evaluation config on the NoCaps dataset. We looked into...

Hi, is there any plan to release the finetuned BLIP-v2 model weights on VQAv2?