
Why not ResNet

Open · FangmingZhou opened this issue 5 years ago · 4 comments

Notice that the results in the paper 'Deep Cross-Modal Projection Learning for Image-Text Matching' are {top-1 = 49.37%, top-10 = 79.27%}, while the results in this project are {top-1 = 42.999%, top-10 = 67.869%}, which come from a model based on MobileNet. So why not provide a new version based on ResNet? ^^ It would be greatly helpful for us beginners! Thanks a lot!

FangmingZhou avatar Dec 17 '19 10:12 FangmingZhou

Hello, are your results based on CUHK-PEDES?

wxh001qq avatar Mar 04 '20 07:03 wxh001qq

> Hello, are your results based on CUHK-PEDES?

yes

FangmingZhou avatar Mar 06 '20 09:03 FangmingZhou

> Hello, are your results based on CUHK-PEDES?
>
> yes

I found that whether or not you use nn.DataParallel() strongly influences the result: without nn.DataParallel() I got about {top-1 = 31%, top-10 = 55%}, while with it I got about {top-1 = 42%, top-10 = 67%}.
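For reference, a minimal sketch of how nn.DataParallel is typically applied (the toy model and sizes here are illustrative assumptions, not the project's code). One plausible reason for the gap is that DataParallel splits each batch across GPUs, so layers like BatchNorm compute statistics on smaller per-device chunks, which changes training dynamics relative to a single device:

```python
import torch
import torch.nn as nn

# Hypothetical toy encoder standing in for the MobileNet image branch.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))

# nn.DataParallel scatters each input batch across all visible GPUs and
# gathers the outputs back on the default device. With one GPU (or CPU)
# it is a no-op wrapper, so behavior only diverges on multi-GPU machines.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

x = torch.randn(8, 128)   # a batch of 8 feature vectors
out = model(x)
print(out.shape)          # torch.Size([8, 32])
```

If the multi-GPU effective batch size is what matters, matching it on a single GPU (or switching to DistributedDataParallel with synchronized BatchNorm) would be a way to test that hypothesis.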

wxh001qq avatar Mar 06 '20 14:03 wxh001qq

> Hello, are your results based on CUHK-PEDES?
>
> yes
>
> I found that whether or not you use nn.DataParallel() strongly influences the result: without nn.DataParallel() I got about {top-1 = 31%, top-10 = 55%}, while with it I got about {top-1 = 42%, top-10 = 67%}.

I haven't used this parallel method, and I haven't seen anyone online mention that it leads to different results. You may need to ask someone else.

FangmingZhou avatar Mar 07 '20 07:03 FangmingZhou