
LAVIS - A One-stop Library for Language-Vision Intelligence

Results 378 LAVIS issues

I am using BLIP-2's feature extraction; however, for `feature_text`, the text embedding shape is [1, 6, 768] instead of [1, 12, 768]. I am confused whether this is expected or whether I did...

A module that was compiled using NumPy 1.x cannot be run in NumPy 2.0.2 as it may crash. To support both 1.x and 2.x versions of NumPy, modules must be...

Thank you for open-sourcing such a cool project as BLIP-3. I found that some code is not provided in the `xgen-mm` branch, specifically `open_flamingo/train/data` and `open_flamingo/train/data_utils`. I...

I ran into this problem while loading the model:

```
model, vis_processors, text_processors = load_model_and_preprocess("blip2_image_text_matching", "pretrain", device=device, is_eval=True)
File "/home/user/anaconda3/envs/lavis/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1825, in from_pretrained
    return cls._from_pretrained(
File "/home/user/anaconda3/envs/lavis/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1988, in...
```

I want to use BLIP-2 in downstream tasks that only support numpy==1.23; however, BLIP-2 in LAVIS needs numpy>=2.0. I wonder if there is any way to extract the BLIP-2...
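One common way around this kind of version conflict (a sketch, not from the issue itself) is to decouple the two environments: extract the features once in the LAVIS environment and serialize them with `np.save`, since the `.npy` format is readable by both NumPy 1.x and 2.x. The array shape below is an illustrative stand-in, not the real BLIP-2 output.

```python
# Sketch: hand features from a numpy>=2.0 environment to a numpy==1.23
# environment via a .npy file. The (1, 12, 768) shape is an assumption
# standing in for a real BLIP-2 feature tensor.
import os
import tempfile

import numpy as np

# In the LAVIS environment: save the extracted features to disk.
features = np.zeros((1, 12, 768), dtype=np.float32)  # stand-in for real output
path = os.path.join(tempfile.gettempdir(), "blip2_features.npy")
np.save(path, features)

# In the downstream environment: load them back with its own NumPy.
loaded = np.load(path)
print(loaded.shape)  # (1, 12, 768)
```

This avoids importing LAVIS (and its NumPy pin) in the downstream task at all.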

When trying to fine-tune BLIP-2 with `caption_coco_ft.yaml`, I got the following error:

```
File "/data/a/bowenz/LAVIS/lavis/tasks/base_task.py", line 222, in _train_inner_loop
    loss, loss_dict = self.train_step(model=model, samples=samples)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/a/bowenz/LAVIS/lavis/tasks/base_task.py", line 64, in...
```

I want to adapt Grad-CAM to Hugging Face VQA models like this: input: image, query; output: answer. Then, when the model generates an answer, can I create a Grad-CAM map for each answer token? I want to...

I am trying to improve the performance of the BLIP-2 and InstructBLIP models by giving some pre-defined examples (user-assistant chat history). How can I do this?
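One approach worth trying (a sketch, not an official LAVIS feature) is to fold the pre-defined turns into the single text prompt these models already accept, since InstructBLIP takes one prompt string per generation call. The helper and its "User:"/"Assistant:" template below are hypothetical, not a documented InstructBLIP format.

```python
# Hypothetical helper: fold pre-defined user/assistant turns into one prompt
# string for an instruction-tuned VQA/captioning model. The turn template is
# an assumption, not a documented format.
def build_few_shot_prompt(history, question):
    lines = []
    for user_turn, assistant_turn in history:
        lines.append(f"User: {user_turn}")
        lines.append(f"Assistant: {assistant_turn}")
    lines.append(f"User: {question}")
    lines.append("Assistant:")  # leave the final turn open for the model
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    [("What color is the car?", "Red.")],
    "How many people are in the image?",
)
# The resulting string could then be passed as the prompt in the samples
# dict given to model.generate(...).
```

Whether the model actually benefits from such in-context examples depends on its instruction tuning; this only shows how to construct the input.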

```
root@autodl-container-7bda11a2fa-647577f7:~/LAVIS# python tests/models/test_instructblip.py
/root/miniconda3/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/root/miniconda3/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev'
If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise,...
```

I only want to use the first stage of BLIP-2, mainly the Q-Former, with my own dataset. Which files should I run? I don't quite understand the code structure. Thanks!
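For reference, LAVIS training runs are typically launched through `train.py` with a project config; a sketch for stage-1 (Q-Former) pretraining is below. The exact config path should be verified against your checkout, and the `datasets` section of the YAML must first be pointed at your own data.

```shell
# Sketch: launch BLIP-2 stage-1 pretraining from the LAVIS repo root.
# The config path is an assumption based on the repo's project layout;
# adjust --nproc_per_node to your GPU count.
python -m torch.distributed.run --nproc_per_node=1 train.py \
  --cfg-path lavis/projects/blip2/train/pretrain_stage1.yaml
```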