VGG-FCN t7 file

Open YuanEZhou opened this issue 7 years ago • 6 comments

You released the Res-101.t7 file for extracting image features. Could you also release the VGG-FCN t7 file? Thanks.

YuanEZhou avatar Jan 02 '18 12:01 YuanEZhou

When I implemented the FCN-based text-guided attention model, I used the pre-trained Caffe model from "From Captions to Visual Concepts and Back". Unfortunately, the FCN model and the corresponding feature extraction code are currently missing. Sorry about that, but I think it is simple to load the pre-trained Caffe model in Torch with the loadcaffe package, so you should be able to write the extraction code easily.

JonghwanMun avatar Jan 02 '18 12:01 JonghwanMun

Thanks!

YuanEZhou avatar Jan 03 '18 06:01 YuanEZhou

The "deploy.prototxt" file is in the "visual-concepts/output/vgg" folder: https://github.com/s-gupta/visual-concepts/tree/master/output/vgg . I extracted features at the layer named "fc-conv7" (or "relu7"), and the size of the feature maps was 10x10.
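The 10x10 size can be checked with a little arithmetic. The sketch below is only an illustration and rests on assumptions that the thread does not confirm: a standard VGG-16 layout (3x3 convolutions with padding 1, five 2x2 stride-2 poolings with Caffe-style ceiling rounding), fc6 converted to a 7x7 convolution without padding, fc7 converted to a 1x1 convolution, and the 512x512 input mentioned in the question. The function names are hypothetical; check the actual deploy prototxt before relying on the numbers.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output length of a pooling layer with ceiling rounding (Caffe style)."""
    return -(-(size - kernel) // stride) + 1


def vgg_fcn_map_size(input_size):
    """Spatial size at the fc7-equivalent layer of a fully convolutional VGG-16.

    Assumed layout (not taken from the thread): five conv blocks, each
    followed by 2x2/2 pooling, then fc6 as a 7x7 conv and fc7 as a 1x1 conv.
    """
    size = input_size
    for _ in range(5):
        size = conv_out(size, kernel=3, pad=1)  # 3x3 convs preserve the size
        size = pool_out(size)                   # each pooling halves it
    size = conv_out(size, kernel=7)             # fc6 as 7x7 conv, no padding
    size = conv_out(size, kernel=1)             # fc7 as 1x1 conv
    return size


print(vgg_fcn_map_size(512))
```

Under these assumptions, the five poolings reduce 512 to 16, and the unpadded 7x7 convolution reduces 16 to 10, which agrees with the 10x10 maps reported above.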

If you have other questions, feel free to ask me :)

2018-01-03 16:53 GMT+09:00 YE Zhou:

I followed your advice and downloaded snapsshot_iter240000.caffemodel from the corresponding address, but there is no 'deploy.prototxt' file. Did you modify the original VGG-16 prototxt file by replacing the fully connected layers (fc6, fc7, fc8) with convolutional ones? What are the dimensions of the extracted feature maps given 512x512 images?

JonghwanMun avatar Jan 04 '18 04:01 JonghwanMun

Thank you! I found the deploy.prototxt file, but I am still confused about one thing. If you extract features at the layer named "fc-conv7" (or "relu7"), the size of the feature maps is 10x10x4096. That is a very large feature dimension! Why not extract features at the layer named "fc8_coco"?

YuanEZhou avatar Jan 04 '18 05:01 YuanEZhou

The output of the "fc8_coco" layer is a probability for each attribute, so I use the visual features from "fc-conv7" instead, just as we usually take features from the fc7 layer rather than the fc8 layer in VGG-net. Also, note that the ResNet outputs are 14x14x2048 feature maps given 448x448 images.
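To put the two feature sizes side by side, here is a small sketch that counts the values in each feature map using only the shapes quoted in this thread. The variable and function names are hypothetical, and the comparison says nothing about which features work better; it only shows that the two tensors are of similar size.

```python
from math import prod

# Shapes quoted in this thread: (height, width, channels).
vgg_fcn_shape = (10, 10, 4096)   # "fc-conv7" features
resnet_shape = (14, 14, 2048)    # ResNet features for 448x448 images


def num_values(shape):
    """Total number of values in a feature map of the given shape."""
    return prod(shape)


def num_regions(shape):
    """Number of spatial locations available to an attention model."""
    height, width, _ = shape
    return height * width


for name, shape in [("VGG-FCN", vgg_fcn_shape), ("ResNet", resnet_shape)]:
    print(name, num_regions(shape), "regions,", num_values(shape), "values")
```

The VGG-FCN map holds 409,600 values over 100 regions and the ResNet map holds 401,408 values over 196 regions, so the 10x10x4096 features are not substantially larger than the ResNet features already in use.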

JonghwanMun avatar Jan 04 '18 05:01 JonghwanMun

Thank you very much!

YuanEZhou avatar Jan 04 '18 05:01 YuanEZhou