CLIP about position embedding scale

Open OliverHuang1220 opened this issue 1 year ago • 0 comments

Thanks for the good work. In this repo the position embedding is multiplied by a scaling factor at initialization, which the original ViT does not do. The paper also mentions that they "use a slightly different initialization scheme". How should this scaling be explained?
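To make the question concrete, here is a minimal sketch of the scaled initialization as I understand it. It assumes the factor is `width ** -0.5` applied to standard normal samples, and it uses plain Python instead of PyTorch; the function names `init_position_embedding` and `mean_row_norm` and the chosen sizes are mine, not taken from the repo.

```python
import math
import random


def init_position_embedding(num_positions, width, seed=0):
    """Sample a (num_positions x width) table from N(0, 1), then scale it.

    Assumption: the scaling factor is width ** -0.5.
    """
    rng = random.Random(seed)
    scale = width ** -0.5
    return [
        [scale * rng.gauss(0.0, 1.0) for _ in range(width)]
        for _ in range(num_positions)
    ]


def mean_row_norm(table):
    """Average L2 norm of the per-position embedding vectors."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in table) / len(table)


# Hypothetical sizes: 224x224 input, 32x32 patches, plus one class token.
width = 768
num_positions = (224 // 32) ** 2 + 1

pos = init_position_embedding(num_positions, width)
print(len(pos), len(pos[0]))
print(round(mean_row_norm(pos), 2))
```

Under this assumption each entry has standard deviation `width ** -0.5`, so every position vector starts with an L2 norm close to 1 rather than close to `sqrt(width)`. Whether that is the intended reason for the scaling is exactly what I would like to confirm.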

Jan 25 '24 10:01 OliverHuang1220
