VinVL
Change the size of the region features
Hi, is it possible to change the region feature size from 2048 to 500 when fine-tuning the captioning model? If so, what should I change? Alternatively, is it OK to zero-pad the features up to 2048?
I would like to fine-tune the captioning model with a different object detection model, one that returns region features with fewer than 2048 dimensions. @xjli @pzzhang @xiaoweihu
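To make the padding option concrete, here is a minimal sketch of the zero-padding in question. It assumes the detector output is a NumPy array of shape `(num_regions, 500)`. The helper name `pad_region_features` is hypothetical and is not part of the VinVL code.

```python
import numpy as np


def pad_region_features(features, target_dim=2048):
    """Right-pad each region feature vector with zeros up to target_dim.

    Hypothetical helper for illustration only; not a VinVL function.
    """
    features = np.asarray(features, dtype=np.float32)
    if features.ndim != 2:
        raise ValueError("expected an array of shape (num_regions, feature_dim)")
    num_regions, dim = features.shape
    if dim > target_dim:
        raise ValueError(f"feature_dim {dim} exceeds target_dim {target_dim}")
    # Pad only the feature axis (axis 1); leave the region axis untouched.
    return np.pad(features, ((0, 0), (0, target_dim - dim)), mode="constant")


# Example: 10 regions with 500-dimensional features from a smaller detector.
small = np.random.rand(10, 500).astype(np.float32)
padded = pad_region_features(small)
print(padded.shape)
```

This only makes the shapes match. Whether the pretrained weights behave sensibly on inputs that are mostly zeros is the open question here.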
Thanks!