
Maybe a few differences with mxnet

ruiming46zrm opened this issue on Jan 10, 2019 · 7 comments

Hi @TreB1eN, I've been studying your code for a few days; it's very nice work. Compared with the mxnet version, I found two key differences in the network:

1. In the bottleneck res layer, there should be a BN between the first conv and the PReLU, as in mxnet.
2. Your shortcut method:

    if depth == in_channel:
        self.shortcut_layer = MaxPool2d(1, stride)
    else:
        self.shortcut_layer = Sequential(
            Conv2d(in_channel, depth, (1, 1), stride, bias=False), BatchNorm2d(depth))

makes the shortcut of the first unit in the first get_block become MaxPool2d(1, 2) rather than Conv+BN, because the input channel count (64) equals depth. That differs from the mxnet code. Maybe condition on stride instead:

    if stride == 1:
        ....
    else:
        ....
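Putting both changes together, a minimal sketch of the modified unit might look like this (my reading of the mxnet code; the class name and forward pass are assumptions, not the repo's actual implementation):

    from torch.nn import BatchNorm2d, Conv2d, MaxPool2d, Module, PReLU, Sequential

    class BottleneckIR(Module):  # hypothetical name, sketch only
        def __init__(self, in_channel, depth, stride):
            super().__init__()
            # use Conv+BN whenever stride != 1, even if in_channel == depth
            if stride == 1 and in_channel == depth:
                self.shortcut_layer = MaxPool2d(1, stride)  # identity when stride == 1
            else:
                self.shortcut_layer = Sequential(
                    Conv2d(in_channel, depth, (1, 1), stride, bias=False),
                    BatchNorm2d(depth))
            self.res_layer = Sequential(
                BatchNorm2d(in_channel),
                Conv2d(in_channel, depth, (3, 3), (1, 1), 1, bias=False),
                BatchNorm2d(depth),  # the BN added between conv and PReLU (point 1)
                PReLU(depth),
                Conv2d(depth, depth, (3, 3), stride, 1, bias=False),
                BatchNorm2d(depth))

        def forward(self, x):
            return self.res_layer(x) + self.shortcut_layer(x)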

I don't know whether this is the reason for the low megaface results. There are other differences too: mxnet doesn't seem to use the SE model to train res50, so why do we? And drop_ratio = 0.4 in mxnet but 0.6 in torch.
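For reference, drop_ratio enters through the embedding head. A minimal sketch of that head, assuming the usual 112x112 input (7x7 final feature map) and torch.nn.Flatten standing in for the repo's custom Flatten:

    from torch.nn import Sequential, BatchNorm2d, Dropout, Flatten, Linear, BatchNorm1d

    # mxnet reportedly uses drop_ratio = 0.4; the PyTorch port used 0.6
    output_layer = Sequential(
        BatchNorm2d(512),
        Dropout(0.4),
        Flatten(),
        Linear(512 * 7 * 7, 512),
        BatchNorm1d(512))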
Do you have any suggestions?

I will change the above and train.

ruiming46zrm · Jan 10, 2019

Yes, I noticed the difference a while ago. The drop_ratio is a mistake, and the choice of se_model was just arbitrary.

Please open a PR if you can get better performance.

TreB1eN · Jan 10, 2019

With the modified network, large batch_size = 384 (4 GPUs), lr = 0.1, milestones = [3, 6, 9, 12], and drop_ratio = 0.4, training ir_se50 gives:

    lfw accuracy: 99.78%
    agedb-30: 97.58%
    megaface rank-1: 96.4%

I think the drop ratio and batch size matter a lot.
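A minimal sketch of that schedule (the model, loader, and train_one_epoch helper are hypothetical names, and momentum/weight decay are my assumptions, not stated above):

    import torch

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                momentum=0.9, weight_decay=5e-4)  # momentum/decay assumed
    # decay the lr at epochs 3, 6, 9, 12, matching the milestones above
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9, 12])

    for epoch in range(num_epochs):
        train_one_epoch(model, train_loader, optimizer)  # hypothetical helper
        scheduler.step()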

ruiming46zrm · Jan 15, 2019

@ruiming46zrm Can you share the modifications you made in detail, especially to the network?

bnu-wangxun · Jan 24, 2019

@bnulihaixia As above: 1. the shortcut; 2. the added BN.
You may try a large batch size and a large lr to get a better result.

ruiming46zrm · Feb 1, 2019

@ruiming46zrm thanks for your response.

bnu-wangxun · Feb 2, 2019

Hi, do we need to normalize the Facescrub and Megaface images before feeding them to the model to extract features? Thank you for reading.
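(The thread never answers this, but as a sketch of the kind of per-channel normalization typically applied at test time in this repo, assuming torchvision transforms:)

    from torchvision import transforms

    # A guess at typical preprocessing, not confirmed in this thread:
    # convert to a [0, 1] tensor, then normalize each channel to roughly [-1, 1]
    test_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    ])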

ghost · Sep 20, 2019

@TreB1eN Did you do it on purpose, and if so, why did you modify it like this? Or was it unintentional?

konioyxgq · Apr 22, 2020