hourglass-facekeypoints-detection

training is good, eval is worse

Open ChiefGodMan opened this issue 6 years ago • 7 comments

When I trained the hg model on the 300W dataset, the training results looked good, but the results are worse in eval mode. Maybe it's because of the BN layers. Have you run into this problem? What should I do about it?

ChiefGodMan, Sep 20 '18 04:09

@ailias No, I haven't run into this problem. Did you call net.eval() before evaluation?

raymon-tian, Sep 21 '18 08:09
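For context on why net.eval() matters here: batch norm normalizes with the statistics of the current batch in train mode, but with accumulated running averages in eval mode, so the two modes can produce different outputs when the running averages have not caught up with the data. The following is a minimal pure-Python sketch of that mechanism for a single channel, not code from this repository. TinyBatchNorm is a hypothetical name, the constants are assumed to match the defaults of torch.nn.BatchNorm2d (momentum 0.1, running mean starting at 0, running variance starting at 1), and the learnable scale and shift are left out.

```python
# Pure-Python sketch of batch-norm statistics for one channel.
# Assumption: constants mirror torch.nn.BatchNorm2d defaults; affine parameters omitted.
# TinyBatchNorm is a hypothetical illustration, not part of the repository.

EPS = 1e-5
MOMENTUM = 0.1


class TinyBatchNorm:
    def __init__(self):
        self.running_mean = 0.0
        self.running_var = 1.0
        self.training = True

    def __call__(self, batch):
        if self.training:
            n = len(batch)
            mean = sum(batch) / n
            # Biased variance is used to normalize the current batch.
            var = sum((x - mean) ** 2 for x in batch) / n
            # Unbiased variance is used to update the running estimate.
            unbiased = var * n / (n - 1)
            self.running_mean = (1 - MOMENTUM) * self.running_mean + MOMENTUM * mean
            self.running_var = (1 - MOMENTUM) * self.running_var + MOMENTUM * unbiased
        else:
            # Eval mode: ignore the batch statistics, use the running averages.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / (var + EPS) ** 0.5 for x in batch]


bn = TinyBatchNorm()
batch = [10.0, 12.0, 14.0, 16.0]

train_out = bn(batch)      # one training step, normalized with batch statistics
bn.training = False
eval_early = bn(batch)     # running averages are still close to their initial values

bn.training = True
for _ in range(200):       # many more training steps on the same distribution
    bn(batch)
bn.training = False
eval_late = bn(batch)      # running averages have converged

print("train     :", [round(v, 3) for v in train_out])
print("eval early:", [round(v, 3) for v in eval_early])
print("eval late :", [round(v, 3) for v in eval_late])
```

In this toy setup the early eval output is far from the train output, while the late eval output is close to it. A small gap remains because the running variance is updated with the unbiased estimate, which matters for a batch of only four values.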

@ailias, I have the same problem. After training finishes, if I set the net to eval mode, its output is much worse than in train mode :(

thruster1996, Oct 23 '18 13:10

I ran into the same problem when training the hg model on the MPII dataset, using a Titan Xp GPU and PyTorch 0.4.1. Changing lines 22-23 of hg.py from

self.res2 = Residual(128, 128)
self.res3 = Residual(128, self._nFeats)

to

self.res2 = Residual(128, self._nFeats)
self.res3 = Residual(self._nFeats, self._nFeats)

solved my problem. Hope it will help you too.

TingmanYan, Dec 07 '18 03:12

Can you tell me how to run the code? Running python train.py fails with:

Traceback (most recent call last):
  File "train.py", line 123, in <module>
    net = KFSGNet()
TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:

  • (torch.device device)
  • (torch.Storage storage)
  • (Tensor other)
  • (tuple of ints size, torch.device device)
  • (object data, torch.device device)

Thank you very much. Best wishes.

4Statistics, Mar 24 '19 02:03

Could you please share the training set?

silvercherry, Sep 18 '19 03:09

@4Statistics Have you solved this problem?

silvercherry, Sep 19 '19 05:09

In models.py, starting at line 64, the code should be modified to:

nn.Conv2d(ins, int(outs//2), 1),
nn.BatchNorm2d(int(outs//2)),
nn.ReLU(inplace=True),
nn.Conv2d(int(outs//2), int(outs//2), 3, 1, 1),
nn.BatchNorm2d(int(outs//2)),
nn.ReLU(inplace=True),
nn.Conv2d(int(outs//2), outs, 1)

ilaij0810, Aug 10 '20 06:08
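The argument list in the TypeError above, (float, int, int, int), is consistent with a float channel count reaching a layer constructor. In Python 3 that happens whenever a channel count is computed with `/`, which always returns a float, rather than `//`. The thread does not show the original line in models.py, so this reading is an assumption. The sketch below only illustrates the division behavior; `outs = 256` and the `check_channels` helper are hypothetical and not part of the repository.

```python
# Python 3: `/` always returns a float, `//` keeps ints as ints.
outs = 256  # hypothetical channel count, for illustration only

half_true_div = outs / 2     # float
half_floor_div = outs // 2   # int

print(type(half_true_div).__name__, half_true_div)    # float 128.0
print(type(half_floor_div).__name__, half_floor_div)  # int 128


def check_channels(n):
    """Hypothetical guard: fail early and readably if a channel count is not an int."""
    if not isinstance(n, int):
        raise TypeError(f"channel count must be int, got {type(n).__name__}: {n!r}")
    return n


check_channels(half_floor_div)  # passes silently

try:
    check_channels(half_true_div)
except TypeError as exc:
    message = str(exc)
    print(message)  # channel count must be int, got float: 128.0
```

When `outs` is already an int, `outs // 2` is an int as well, so the outer `int(...)` in the suggested fix is only a safeguard for the case where `outs` itself arrives as a float.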