
LAVIS - A One-stop Library for Language-Vision Intelligence

282 LAVIS issues, sorted by recently updated

For partial model training, e.g. BLIP-2, it is not necessary to checkpoint the entire model; saving the trainable weights is sufficient. This is currently not supported by the runner. A sketch of the idea follows this entry.

enhancement
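One possible shape for this, sketched outside the LAVIS runner with hypothetical helper names: save only the parameters that require gradients (e.g. the Q-Former in BLIP-2) and restore the frozen backbone from its original pretrained checkpoint.

```
# Hedged sketch (not LAVIS runner API): checkpoint only parameters with
# requires_grad=True, assuming the frozen backbone can be restored from
# its original pretrained checkpoint.
import torch

def save_trainable_state(model, path):
    trainable = {
        name: param.detach().cpu()
        for name, param in model.named_parameters()
        if param.requires_grad
    }
    torch.save(trainable, path)

def load_trainable_state(model, path):
    # strict=False leaves the frozen parameters untouched
    model.load_state_dict(torch.load(path), strict=False)
```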

Hey :smile: , when installing lavis via pip, the CLIP vocab file `bpe_simple_vocab_16e6.txt.gz` is missing, which throws `FileNotFoundError: [Errno 2] No such file or directory: '/home/maxi/ml/test-lavis/venv/lib/python3.10/site-packages/lavis/models/clip_models/bpe_simple_vocab_16e6.txt.gz'`. I... A possible workaround is sketched below.

cla:missing
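A hedged workaround until the packaging is fixed, assuming the raw file is still published at this path in the LAVIS GitHub repo (verify the URL before relying on it):

```
# Fetch the missing CLIP vocab file into the installed package directory.
# The raw.githubusercontent.com URL is an assumption based on the file's
# location in the LAVIS source tree.
import os
import urllib.request

import lavis.models.clip_models as clip_models

url = ("https://raw.githubusercontent.com/salesforce/LAVIS/main/"
       "lavis/models/clip_models/bpe_simple_vocab_16e6.txt.gz")
dest = os.path.join(os.path.dirname(clip_models.__file__),
                    "bpe_simple_vocab_16e6.txt.gz")
if not os.path.exists(dest):
    urllib.request.urlretrieve(url, dest)
```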

Tested blip2_image_text_matching.ipynb in Colab; running `model, vis_processors, text_processors = load_model_and_preprocess("blip2_image_text_matching", "pretrain", device=device, is_eval=True)` raised "AttributeError: 'NoneType' object has no attribute 'from_pretrained'". Also tried: `from lavis.models import model_zoo; print(model_zoo)`...
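For reference, a self-contained version of the failing call from the notebook excerpt; the only addition beyond the excerpt is the device selection:

```
import torch
from lavis.models import load_model_and_preprocess, model_zoo

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(model_zoo)  # inspect registered model names/types before loading
model, vis_processors, text_processors = load_model_and_preprocess(
    "blip2_image_text_matching", "pretrain", device=device, is_eval=True
)
```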

Currently, `train_caption_coco_large.sh` is not located under `run_scripts/lavis/blip/...` but under `run_scripts/blip/...`. I found 4 hits, so I fixed them.

```
awkrail@taichi-nishimuranoMacBook-Pro docs % ag "run_scripts/lavis"
tutorial.training-example.rst
10: bash run_scripts/lavis/blip/train/train_caption_coco_large.sh
tutorial.configs.rst
11: bash...
```

cla:signed

- I find that in this repo, https://github.com/salesforce/LAVIS/blob/main/lavis/configs/models/med_config.json has `num_attention_heads` = 12, but in https://huggingface.co/Salesforce/blip-image-captioning-large/blob/main/config.json, [blip_text_model] has `num_attention_heads` = 8.
- Also, blip_vision_model's eps should be 1e-6: https://github.com/salesforce/LAVIS/blob/2b6c6caf223e1a9a5139842d3191cad4166466b8/lavis/models/vit.py#L209

Thanks for your awesome work on BLIP-2; it displays surprising abilities when combining an LLM with an image encoder! Do you plan to release the code to pre-train such a model? We...

Hi! I am checking out BLIP-2 and trying to caption some images with `generate`. I get an error:

```
caption = model.generate({"image": image.float()}, use_nucleus_sampling=True, num_captions=3)
File "/home/groueix/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line...
```
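For comparison, a minimal captioning sketch built around the same `generate` call; the model name/type and image path are assumptions, and the image tensor is left in the preprocessor's dtype rather than cast with `.float()`:

```
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt", model_type="pretrain_opt2.7b",  # assumed model choice
    is_eval=True, device=device
)
raw_image = Image.open("example.jpg").convert("RGB")  # hypothetical image path
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
captions = model.generate(
    {"image": image}, use_nucleus_sampling=True, num_captions=3
)
print(captions)
```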

Hi there, could you please provide an example of how to run the Video Question Answering task using LAVIS? Examples of other video-related tasks would also be much appreciated.

Hi. I'm trying to use your [colab](https://colab.research.google.com/github/salesforce/LAVIS/blob/main/examples/blip2_instructed_generation.ipynb) with the most powerful model (the default in the colab):

```
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_t5", model_type="pretrain_flant5xxl", is_eval=True, device=device
)
```
...