cgd
What is the meaning of 'non-conventional usage' of the backbone in Table 7 of the paper?
Thank you for your great work on this repo and paper! I notice that the ResNet-50 result with non-conventional usage has the best performance, and I want to know how to implement this 'non-conventional usage'. Does it mean 'discarding the down-sampling operation between stage3 and stage4' from section 3.1 of the paper? Thanks a lot.
Thanks for the interest in our paper. To your question: yes, it means 'discarding the down-sampling operation between stage3 and stage4' in section 3.1 of the paper. So you are right :)
Thank you for your response! So 'discarding the down-sampling operation between stage3 and stage4' means changing the last stride of ResNet layer4 from 2 to 1? That is, last stride = 1?
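In code, I understand it like this; a minimal sketch with torchvision's ResNet-50 (my own reading of 'last stride = 1', not taken from your repo):

```python
import torch
from torchvision.models import resnet50

# Sketch: force last stride = 1 on a torchvision ResNet-50, i.e. drop the
# down-sampling between stage3 (layer3) and stage4 (layer4).
model = resnet50()
model.layer4[0].conv2.stride = (1, 1)          # the 3x3 conv that normally halves resolution
model.layer4[0].downsample[0].stride = (1, 1)  # the 1x1 conv on the shortcut branch

x = torch.randn(1, 3, 224, 224)
x = model.maxpool(model.relu(model.bn1(model.conv1(x))))
x = model.layer4(model.layer3(model.layer2(model.layer1(x))))
print(x.shape)  # torch.Size([1, 2048, 14, 14]) instead of [1, 2048, 7, 7]
```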
Recently I reimplemented your work in PyTorch, but when evaluating on CUB-200-2011 I only get 62% Recall@1. I believe I'm missing some important details, so a few questions:
- Batch sampling: do you shuffle all samples and take 128 per batch, or use the P-K sampling format (P classes, K samples per class)? (There is a sketch of what I mean by P-K sampling after this list.)
- I find that removing the L2 norm and the FC after the GD (as another issue suggested) gives higher performance, though it still doesn't reach the results you report on CUB-200. Do you know the reason for this? (The head I mean is also sketched below.)
- Could you share some training tricks? Looking forward to your guidance. Thank you so much!
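For clarity, this is the P-K sampling I mean; a minimal sketch (the class name PKSampler and the arguments p, k are my own, not from your repo). It pairs with a DataLoader using batch_size=p*k and this sampler:

```python
import random
from collections import defaultdict
from torch.utils.data import Sampler

class PKSampler(Sampler):
    """Sketch of a P-K sampler: each batch holds P classes with K samples each.

    `labels` is a list where labels[i] is the class label of dataset item i.
    """
    def __init__(self, labels, p, k):
        self.p, self.k = p, k
        self.index_by_class = defaultdict(list)
        for idx, label in enumerate(labels):
            self.index_by_class[label].append(idx)
        self.num_batches = len(labels) // (p * k)

    def __iter__(self):
        for _ in range(self.num_batches):
            classes = random.sample(list(self.index_by_class), self.p)
            for c in classes:
                pool = self.index_by_class[c]
                # sample with replacement if a class has fewer than K images
                picks = random.choices(pool, k=self.k) if len(pool) < self.k \
                        else random.sample(pool, self.k)
                # yield flat indices; DataLoader(batch_size=p*k) groups them
                yield from picks

    def __len__(self):
        return self.num_batches * self.p * self.k
```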
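And this is the descriptor head I'm describing in the second question: GD -> FC -> L2 norm, with the two parts I tried removing marked. A rough sketch assuming a GeM-style global descriptor (all names here are mine, not from your code):

```python
import torch
import torch.nn.functional as F
from torch import nn

class GDHead(nn.Module):
    """Sketch of one global-descriptor branch: GeM pooling -> FC -> L2 norm."""
    def __init__(self, in_dim=2048, out_dim=512, p=3.0):
        super().__init__()
        self.p = p
        self.fc = nn.Linear(in_dim, out_dim)  # the FC I experimented with removing

    def forward(self, x):  # x: (B, C, H, W) backbone feature map
        x = x.clamp(min=1e-6).pow(self.p)
        x = F.adaptive_avg_pool2d(x, 1).pow(1.0 / self.p).flatten(1)  # GeM descriptor
        x = self.fc(x)
        return F.normalize(x, p=2, dim=1)  # the L2 norm I experimented with removing
```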