CLIP-ReID: Will L2 normalization for image and text lead to better results?

Open FranklinLingfeng opened this issue 1 year ago • 0 comments

When aligning image and text features, why is it not necessary to L2-normalize them? Without normalization, won't the norm of the image features grow very large simply to reduce the i2t loss in the second stage of training?
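To make the concern concrete, here is a minimal sketch in plain Python. It is not the CLIP-ReID code: the toy features, the function names (`i2t_loss`, `l2_normalize`, `scaled`), and the use of plain dot-product logits with no temperature are all assumptions made for illustration. It only shows that, for a cross-entropy loss over dot-product logits, scaling up an image feature that already points toward the correct text lowers the loss, while the same loss on L2-normalized features does not change with scale.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scaled(v, s):
    """Multiply every component of v by the scalar s."""
    return [s * x for x in v]


def i2t_loss(image_feat, text_feats, target, normalize=False):
    """Cross-entropy of one image feature against all text features.

    Assumption: logits are plain dot products, with no temperature.
    """
    if normalize:
        image_feat = l2_normalize(image_feat)
        text_feats = [l2_normalize(t) for t in text_feats]
    logits = [dot(image_feat, t) for t in text_feats]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Hypothetical toy data: two text features, one image feature closer to text 0.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feat = [0.8, 0.6]

# Without normalization, a 10x larger image feature gives a smaller loss.
raw_small = i2t_loss(image_feat, text_feats, target=0)
raw_large = i2t_loss(scaled(image_feat, 10.0), text_feats, target=0)

# With L2 normalization, the loss is the same at both scales.
norm_small = i2t_loss(image_feat, text_feats, target=0, normalize=True)
norm_large = i2t_loss(scaled(image_feat, 10.0), text_feats, target=0, normalize=True)

print(raw_small, raw_large)
print(norm_small, norm_large)
```

In this toy setting the unnormalized loss can be driven down by norm growth alone, which is the behavior the question asks about. Whether this happens in practice depends on the actual loss and training setup, which this sketch does not reproduce.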

Feb 19 '24 13:02 FranklinLingfeng
