
LAVIS - A One-stop Library for Language-Vision Intelligence

282 LAVIS issues

Hello, I am using the blip2_t5 model (model_type="pretrain_flant5xxl") to predict answers for a given input. I provide a list of answer candidates to the model, but the model still predicts...
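
(For context, a minimal sketch of the call pattern described above; `predict_answers` and its `answer_list` handling for `blip2_t5` are assumptions to check against the installed version, not a confirmed API.)

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load BLIP-2 with the FlanT5-XXL language model.
model, vis_processors, txt_processors = load_model_and_preprocess(
    name="blip2_t5", model_type="pretrain_flant5xxl", is_eval=True, device=device
)

raw_image = Image.open("example.jpg").convert("RGB")  # placeholder image path
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
question = txt_processors["eval"]("What color is the button?")

# Assumption: candidate answers are passed via `answer_list`; the report above
# suggests the model may still generate text outside this list.
answers = model.predict_answers(
    samples={"image": image, "text_input": question},
    answer_list=["red", "blue", "yellow"],
    inference_method="generate",
)
print(answers)
```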

Our server is running Ubuntu 22.04 with a 32-core vCPU and 64 GiB of RAM. We have a base coco model, and it was OK, but not very accurate. We...

![image](https://user-images.githubusercontent.com/43492238/229017598-a20120fc-4592-4cc3-a301-c61080d5504d.png) In the training of an encoder-decoder-based LLM (e.g. FlanT5), Prefix Text (i.e. sample['text_input'] in the code) and Suffix Text (i.e. sample['text_output'] in the code) are needed. However, all the datasets in this repo don't have...
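
(Illustrative only: a sketch of what a sample with both fields could look like for encoder-decoder training, assuming `text_input` is the encoder prefix and `text_output` the decoder target; this is not taken from the repo's datasets.)

```python
import torch

# Hypothetical sample layout for an encoder-decoder LLM such as FlanT5:
# "text_input" is the prefix fed to the encoder, "text_output" is the
# suffix used as the decoder target. The question above is that most
# datasets in the repo only provide a single caption field.
sample = {
    "image": torch.randn(3, 224, 224),          # placeholder preprocessed image
    "text_input": "a photo of",                  # prefix text
    "text_output": "a dog playing in a park",    # suffix / target text
}
```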

It looks like the blip_caption / base_coco models were updated and are no longer compatible with the code? Or is it a bug on my side? I'm trying a cog container (that...
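
(One way to sanity-check that checkpoint against the current code is to load it through the model zoo; a minimal sketch, with the image path as a placeholder.)

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the BLIP captioning model finetuned on COCO from the LAVIS model zoo.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip_caption", model_type="base_coco", is_eval=True, device=device
)

raw_image = Image.open("example.jpg").convert("RGB")
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
print(model.generate({"image": image}))
```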

Firstly, thank you for the amazing VQA model! Is the color space of BLIP-2's image BGR or RGB? When doing VQA, it identifies blue buttons as red buttons, and yellow...
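
(The LAVIS image processors work on PIL images, which are RGB; a common cause of swapped colors is passing OpenCV frames, which are BGR, without converting first. A small sketch, assuming the frame comes from cv2:)

```python
import cv2
from PIL import Image

# OpenCV loads images in BGR channel order; convert to RGB before handing
# the frame to the LAVIS vis_processors, which expect RGB PIL images.
bgr = cv2.imread("example.jpg")
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
raw_image = Image.fromarray(rgb)
```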

https://github.com/salesforce/LAVIS/blob/47e0f3f25ca763975738c7224c8369207812ce6c/lavis/models/blip2_models/blip2_opt.py#L24-L25 The model names should be pretrain_opt2.7b and pretrain_opt6.7b
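
(For reference, those keys are what gets passed as `model_type` when loading the OPT variants, roughly like this:)

```python
import torch
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"

# The model_type string must match a key registered in blip2_opt.py,
# e.g. "pretrain_opt2.7b" or "pretrain_opt6.7b".
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt", model_type="pretrain_opt2.7b", is_eval=True, device=device
)
```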

Dear Authors, could you please make the VQA finetuning codebase available to us?

model.generate() can take a batch of images and one prompt; can we pass a batch of images and a batch of prompts? Thanks. EDIT: I want every image to interact with every prompt, so...
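
(A sketch of one way to get the every-image-with-every-prompt behaviour by flattening the cross product into a single batch; whether `generate` accepts a list of prompts matching the image batch size is an assumption to verify, and `blip2_opt` is used here only as an example.)

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt", model_type="pretrain_opt2.7b", is_eval=True, device=device
)

raw_images = [Image.open(p).convert("RGB") for p in ["a.jpg", "b.jpg"]]
prompts = ["Question: what is shown? Answer:", "Question: what color is it? Answer:"]

# Build the cross product: repeat each image once per prompt and tile the
# prompt list across images, so every image is paired with every prompt.
images = torch.stack([vis_processors["eval"](im) for im in raw_images]).to(device)
images = images.repeat_interleave(len(prompts), dim=0)   # (N*M, 3, H, W)
flat_prompts = prompts * len(raw_images)                 # length N*M

# Assumption: `prompt` may be a list with one entry per image in the batch.
outputs = model.generate({"image": images, "prompt": flat_prompts})
print(outputs)
```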

Great work and thanks for releasing the code and checkpoints for BLIP2. One thing we noticed missing is the training and evaluation config on the NoCaps dataset. We looked into...

Hi, is there any plan to release the finetuned BLIP-v2 model weights on VQAv2?