LAVIS
The Problem on blip2_feature_extraction Example
Hello author, in the example/blip2_feature_extraction.ipynb file, there is no blip2_feature_extractor model file mentioned in your code. I also can't find this model in the model zoo. Should I use the BLIP-2 base model from the model zoo directly, or continue to use blip_feature_extractor for encoding?
Hi @xcxhy, did you install from PyPI? If so, can you try the newest release, 1.0.2: https://pypi.org/project/salesforce-lavis/? Feature extraction was not supported in 1.0.0.
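For reference, a minimal sketch of loading the feature extractor through LAVIS's standard entry point, `load_model_and_preprocess`. The `name="blip2_feature_extractor"` and `model_type="pretrain"` values follow the LAVIS model zoo naming as I understand it; verify them against your installed version (>= 1.0.2):

```python
# Hedged sketch, assuming salesforce-lavis >= 1.0.2 is installed and that
# "blip2_feature_extractor" / "pretrain" are the model-zoo names in that release.

def load_blip2_extractor(device="cpu"):
    # Imported lazily so the sketch can be read without LAVIS installed.
    from lavis.models import load_model_and_preprocess

    # Returns (model, vis_processors, txt_processors); the model exposes
    # extract_features(samples, mode=...) for "image", "text", or "multimodal".
    return load_model_and_preprocess(
        name="blip2_feature_extractor",
        model_type="pretrain",
        is_eval=True,
        device=device,
    )
```

Usage would then look like `model, vis_proc, txt_proc = load_blip2_extractor("cuda")` followed by `model.extract_features({"text_input": [caption]}, mode="text")`, but check the notebook in `examples/` for the exact sample dict keys.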
@dxli94 Thank you for your response. I just tested it and the problem is solved. I have a question about a code detail: what is the maximum length of text that BLIP-2 can encode?
@xcxhy, the Q-Former uses a BERT model under the hood, so it can process up to 512 tokens. However, the texts in image-caption training data are usually much shorter, so you can expect better representations for short texts than for lengthy ones.
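To make the limit concrete, here is an illustrative sketch of BERT-style truncation. It uses a naive whitespace tokenizer (the real Q-Former input is BERT WordPiece tokens, so actual token counts differ), and the 512 cap with two slots reserved for special tokens is the standard BERT convention, not something specific to LAVIS:

```python
# Illustration only: BERT's position embeddings cap sequences at 512 tokens,
# including the [CLS] and [SEP] special tokens. A whitespace split stands in
# for the real WordPiece tokenizer here.

MAX_TOKENS = 512  # BERT position-embedding limit


def truncate_for_bert(text, max_tokens=MAX_TOKENS):
    # Reserve 2 slots for [CLS] and [SEP], keep the first tokens of the text.
    tokens = text.split()
    return tokens[: max_tokens - 2]


long_text = " ".join(["word"] * 1000)
print(len(truncate_for_bert(long_text)))   # 510 content tokens survive
print(len(truncate_for_bert("a short caption")))  # short texts pass through
```

In practice this means any caption longer than ~510 WordPiece tokens is silently cut off before encoding.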
May I ask, for the BLIP and BLIP-2 feature extraction models, which pretrained parameters (checkpoints) does each model correspond to?