
LAVIS - A One-stop Library for Language-Vision Intelligence

299 issues, sorted by most recently updated

Hello, I am currently evaluating the BLIP-2 model for one of my use cases, where I need to assess the similarity between text and images. For...
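For text-image similarity specifically, LAVIS exposes BLIP-2's ITC heads through its feature-extractor interface. A minimal sketch following the pattern in the LAVIS README (the image path and caption are placeholders; exact argument names may shift across versions):

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model, vis_processors, txt_processors = load_model_and_preprocess(
    name="blip2_feature_extractor", model_type="pretrain", is_eval=True, device=device
)

raw_image = Image.open("example.jpg").convert("RGB")      # placeholder path
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
text = txt_processors["eval"]("a photo of a cat")          # placeholder caption

sample = {"image": image, "text_input": [text]}
feats_img = model.extract_features(sample, mode="image")
feats_txt = model.extract_features(sample, mode="text")

# Cosine similarity in the shared ITC embedding space. The image side has one
# projected embedding per query token, so take the max over queries.
sim = (feats_img.image_embeds_proj @ feats_txt.text_embeds_proj[:, 0, :].t()).max()
print(sim.item())
```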

In Vicuna-7b-v1.1's config.json, there is:

```
"bos_token_id": 0,
"eos_token_id": 1,
"pad_token_id": -1,
```

In its generation_config.json, there is:

```
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 0,
```

But actually, this...
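A quick way to see all three sources of special-token ids side by side (a sketch using standard `transformers` loaders; `lmsys/vicuna-7b-v1.1` stands in for whichever checkpoint you actually use, and Hub access is assumed):

```python
from transformers import AutoConfig, AutoTokenizer, GenerationConfig

repo = "lmsys/vicuna-7b-v1.1"
cfg = AutoConfig.from_pretrained(repo)            # reads config.json
gen_cfg = GenerationConfig.from_pretrained(repo)  # reads generation_config.json
tok = AutoTokenizer.from_pretrained(repo, use_fast=False)

print("config.json       :", cfg.bos_token_id, cfg.eos_token_id, cfg.pad_token_id)
print("generation_config :", gen_cfg.bos_token_id, gen_cfg.eos_token_id, gen_cfg.pad_token_id)
print("tokenizer         :", tok.bos_token_id, tok.eos_token_id, tok.pad_token_id)
```

At generation time, values in generation_config.json generally take precedence over config.json when both are present, which is why the two files disagreeing matters.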

Hello, thanks for your great work! In `blip2_vicuna_instruct.py`, the `bos_token` of the LLM is changed. Originally it is `<s>` with id 1, but after the following code:

```
self.llm_tokenizer.add_special_tokens({'pad_token':...
```
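For context, `add_special_tokens` behaves differently depending on whether the token already exists in the vocabulary. A minimal illustration with a stock LLaMA-style tokenizer (the remapped values here are illustrative, not necessarily what LAVIS sets):

```python
from transformers import LlamaTokenizer

tok = LlamaTokenizer.from_pretrained("lmsys/vicuna-7b-v1.1", use_fast=False)
print(tok.bos_token, tok.bos_token_id)   # '<s>', 1 for a stock LLaMA tokenizer

# Remapping bos_token to an existing token just changes bos_token_id:
tok.add_special_tokens({"bos_token": "</s>"})
print(tok.bos_token, tok.bos_token_id)   # '</s>', 2

# Adding a genuinely new token extends the vocabulary instead:
tok.add_special_tokens({"pad_token": "[PAD]"})
print(tok.pad_token, tok.pad_token_id, len(tok))  # new id appended at the end
```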

For reference, I find that the configs of [eachadea/vicuna-7b-1.1](https://huggingface.co/eachadea/vicuna-7b-1.1/tree/main) and [lmsys/vicuna-7b-v1.1](https://huggingface.co/lmsys/vicuna-7b-v1.1) are different, i.e. they have different bos_token_id, eos_token_id, and pad_token_id values, and only eachadea/vicuna-7b-1.1 works well with InstructBLIP....
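The divergence is easy to confirm by loading both configs (a small check using `transformers`; Hub access is assumed):

```python
from transformers import AutoConfig

for repo in ("eachadea/vicuna-7b-1.1", "lmsys/vicuna-7b-v1.1"):
    cfg = AutoConfig.from_pretrained(repo)
    print(f"{repo}: bos={cfg.bos_token_id} eos={cfg.eos_token_id} pad={cfg.pad_token_id}")
```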

Dear authors, thank you for your great work, InstructBLIP! I'd like to train InstructBLIP with my own instruction data. Could you provide an example data file, or data-generation code?...
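For anyone sketching their own data while waiting for an answer: LAVIS's instruct models consume sample dicts with `text_input`/`text_output` keys, so a flat JSON list in that shape is a plausible starting point. This schema is an assumption, not the authors' released format:

```python
import json

# Hypothetical record layout (NOT the official schema): one image path,
# one natural-language instruction, and one target answer per sample.
records = [
    {
        "image": "images/0001.jpg",
        "text_input": "What is unusual about this image?",
        "text_output": "A man is ironing clothes on the back of a moving taxi.",
    }
]

with open("instruct_data.json", "w") as f:
    json.dump(records, f, indent=2)
```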

Thanks for your wonderful work. I am trying to pre-train InstructBLIP from scratch on 4x4 A100s. However, GPU memory slowly increases as training progresses, which eventually leads to an out-of-memory...
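One frequent cause of memory that grows slowly across iterations (rather than spiking at once) is accumulating the loss tensor, and with it the whole autograd graph, across steps. A self-contained toy loop showing the safe pattern; whether this is the cause here is only a guess:

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(8, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

running_loss = 0.0
for step in range(100):
    x = torch.randn(4, 8, device=device)
    optimizer.zero_grad()
    loss = model(x).pow(2).mean()
    loss.backward()
    optimizer.step()
    # `.item()` returns a plain float, so this step's graph is freed;
    # `running_loss += loss` would keep every step's graph alive and leak.
    running_loss += loss.item()
```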

Dear authors, I am currently running BLIP-2 Instruct and the code really helps, but I only have two 3090s available. Would you please consider updating the code to support multiple GPUs? Thanks
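Until multi-GPU support lands in this codebase, one workaround is the Hugging Face `transformers` port of InstructBLIP (not LAVIS itself), which can shard the fp16 weights across both 3090s. A sketch, assuming `transformers>=4.31` and `accelerate` are installed:

```python
import torch
from transformers import InstructBlipProcessor, InstructBlipForConditionalGeneration

processor = InstructBlipProcessor.from_pretrained("Salesforce/instructblip-vicuna-7b")
model = InstructBlipForConditionalGeneration.from_pretrained(
    "Salesforce/instructblip-vicuna-7b",
    torch_dtype=torch.float16,
    device_map="auto",   # lets accelerate spread layers across both GPUs
)
```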

In the BLIP-2 paper: "We propose Q-Former as the trainable module to bridge the gap between a frozen image encoder and a frozen LLM. It extracts a fixed number of...
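The mechanism behind the quoted passage is a fixed set of learned query embeddings that cross-attend to the image encoder's patch features, so the output size is constant regardless of input resolution. A toy sketch of that idea (not the actual Q-Former implementation; the 32-query, 768-dim sizes follow the paper):

```python
import torch
from torch import nn

num_queries, dim = 32, 768
queries = nn.Parameter(torch.randn(1, num_queries, dim))   # learned queries
cross_attn = nn.MultiheadAttention(dim, num_heads=12, batch_first=True)

image_feats = torch.randn(1, 257, dim)                      # e.g. ViT patch tokens
out, _ = cross_attn(queries, image_feats, image_feats)      # queries attend to image
print(out.shape)                                            # torch.Size([1, 32, 768])
```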

Hi, thanks for the repository and code. I'd like to run the stylization notebook from [here](https://github.com/salesforce/LAVIS/blob/main/projects/blip-diffusion/notebooks/stylization.ipynb). When running it, I receive the following error:

```python
import torch
import numpy as...
```

When I test instructed zero-shot vision-to-language generation, I get this kind of output; can anybody tell me what's wrong?

```
['10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000']
```

The model I used is:

```
model, vis_processors, _ =...
```
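Output that collapses into a repeated token like this is often traced to a mismatched Vicuna checkpoint or wrong eos/pad token ids (see the config issues above), or to decoding settings. A hedged sketch of loading the InstructBLIP model and passing explicit decoding arguments; the keyword names follow LAVIS's `generate()` for this model family and may differ across versions, and `demo.jpg` is a placeholder:

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_vicuna_instruct", model_type="vicuna7b", is_eval=True, device=device
)
image = vis_processors["eval"](Image.open("demo.jpg").convert("RGB")).unsqueeze(0).to(device)

output = model.generate(
    {"image": image, "prompt": "Describe the image in detail."},
    use_nucleus_sampling=False,
    num_beams=5,
    max_length=128,
    min_length=1,
    repetition_penalty=1.5,   # penalize the kind of token loops shown above
)
print(output)
```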