LAVIS - A One-stop Library for Language-Vision Intelligence

Results: 282 LAVIS issues

Hi, I'm using BLIP-2, and loading the models onto the GPU is very slow even with the weights cached. `AutoModelForCausalLM.from_pretrained('facebook/opt-6.7b', cache_dir=".")` and `load_model_and_preprocess(name="img2prompt_vqa", model_type="base", is_eval=True, device=device)` each take a few minutes. Is...

According to the Hugging Face Transformers documentation, beam-search multinomial sampling can be enabled by setting `num_beams>1` and `do_sample=True`. However, this is not supported in LAVIS. If I set `num_beams=4, num_return_sequences=4` and...

bug

Dear LAVIS team, As part of a project, we are trying to fine-tune BLIP Retrieval with a custom dataset on 2 RTX-3090 24GB GPUs. 1) We are getting the following...

I was trying to reproduce results with BLIP on VQAv2 test-dev and I observed a non-negligible difference between the VQA accuracy obtained using the [published checkpoint](https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_vqa_capfilt_large.pth) (**77.41%**) and the number...

`sub.json` is organized in the format: `[{'image': '4385058960_b0f291553e.jpg', 'caption': 'a wooden chair in the living room', 'url': 'http://static.flickr.com/2723/4385058960_b0f291553e.jpg'}, ...]`, but the downloaded `sbu_images.rar` is extracted as: 0000/ 0001/ 0002/ 0003/...

bug

The requirements file was not updated, causing a few small issues when running `pip install -e .` and trying to run the demo. 1. The `torchvision` version is not specified, thus...

cla:missing

Apart from `spacy`, VQA also requires the `en_core_web_sm` model. This PR is meant to be merged alongside https://github.com/salesforce/LAVIS/pull/75, but is separated from it since it's a bit hacky. Still...

cla:signed

Hi, thank you for the great work in publishing this repository. I'm trying to evaluate CLIP on text-to-image retrieval on Flickr30k by running `evaluate.py` with `--cfg-path lavis/projects/clip/exp_flickr_ret_eval.yaml`. However,...

accomodate -> accommodate

cla:missing