
Text tokenizer difference between forward and extract_feature

Open s7ev3n opened this issue 11 months ago • 3 comments

Hi,

I noticed that in blip2_qformer.py, the forward function truncates text_tokens to a max_length of 32, while the extract_feature function (which, to my understanding, is used for inference) does not truncate them at all, so at inference time the text tokens could be much longer than during training (the forward function). See the sketch below.
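For context, here is a minimal runnable sketch of the two tokenizer calls as I read them in blip2_qformer.py. The bert-base-uncased checkpoint and the sample caption are illustrative assumptions, not the exact LAVIS setup:

```python
# Sketch of the tokenizer difference between forward() and extract_feature()
# in blip2_qformer.py. The checkpoint and caption below are hypothetical.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
caption = "a long caption " * 20  # hypothetical input longer than 32 tokens

# forward() (training): the caption is truncated/padded to max_txt_len (32)
max_txt_len = 32
train_tokens = tokenizer(
    caption,
    padding="max_length",
    truncation=True,
    max_length=max_txt_len,
    return_tensors="pt",
)

# extract_feature() (inference): no truncation or max_length is passed,
# so the tokenized sequence can be much longer than 32
infer_tokens = tokenizer(caption, padding=True, return_tensors="pt")

print(train_tokens.input_ids.shape)  # torch.Size([1, 32])
print(infer_tokens.input_ids.shape)  # e.g. torch.Size([1, 62])
```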

May I ask why this difference exists? In particular, I do not understand why the text tokens are restricted to 32 during training.

Looking forward to the answer :) Thanks!

s7ev3n avatar Aug 17 '23 01:08 s7ev3n