
Problem with the blip2_feature_extraction example

xcxhy opened this issue 2 years ago

Hello author, in the example/blip2_feature_extraction.ipynb file, it seems that no blip2_feature_extractor model is mentioned in your code, and I can't find the model in the model zoo. Should I use the BLIP-2 base model from the model zoo directly, or continue to use blip_feature_extractor for encoding?

xcxhy avatar Feb 27 '23 06:02 xcxhy

Hi @xcxhy, did you install from PyPI? If so, can you try with the newest release 1.0.2: https://pypi.org/project/salesforce-lavis/.

Feature extraction was not supported in 1.0.0.
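
For reference, here is a minimal sketch of loading the feature extractor once you are on a release that ships it, following the pattern of the LAVIS feature-extraction examples (the image path, caption, and the `pretrain` model type are placeholders and may differ in your setup):

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder inputs; replace with your own image and caption.
raw_image = Image.open("example.jpg").convert("RGB")
caption = "a photo of something"

# Load the BLIP-2 feature extractor together with its processors.
model, vis_processors, txt_processors = load_model_and_preprocess(
    name="blip2_feature_extractor", model_type="pretrain", is_eval=True, device=device
)

image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
text_input = txt_processors["eval"](caption)
sample = {"image": image, "text_input": [text_input]}

# Unimodal and multimodal features.
features_image = model.extract_features(sample, mode="image")
features_text = model.extract_features(sample, mode="text")
features_multimodal = model.extract_features(sample)
```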

dxli94 avatar Mar 06 '23 14:03 dxli94

@dxli94 Thank you for your response. I just tested it and the problem is solved. One question about a code detail: what is the maximum text length that BLIP-2 can encode?

xcxhy avatar Mar 07 '23 02:03 xcxhy

@xcxhy, the Q-Former uses a BERT model under the hood, so it can process up to 512 tokens. However, texts in image-caption training data are usually much shorter, so you can expect better representations for short texts than for lengthy ones.
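
As a rough illustration of that limit (assuming a standard BERT tokenizer such as `bert-base-uncased`; the exact tokenizer used by the Q-Former may differ, but the 512-token ceiling is the same), you can check how many tokens a caption produces and truncate it yourself:

```python
from transformers import BertTokenizer

# "bert-base-uncased" is assumed here for illustration.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

caption = "a very long caption " * 100
encoded = tokenizer(caption, truncation=True, max_length=512)
print(len(encoded["input_ids"]))  # at most 512, including [CLS] and [SEP]
```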

dxli94 avatar Mar 07 '23 03:03 dxli94

May I ask, for the BLIP and BLIP-2 feature extraction models, which pretrained parameters (model types) they correspond to?
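
One way to see which architectures and pretrained model types are available (as described in the LAVIS README) is to print the model zoo; the exact entries depend on the installed LAVIS version:

```python
from lavis.models import model_zoo

# Prints a table of architectures (e.g. blip_feature_extractor,
# blip2_feature_extractor) and the model types each one accepts.
print(model_zoo)
```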

cpperrpr avatar Aug 07 '23 02:08 cpperrpr