DeepAlignmentNetwork

Why no weight decay?

Open BKZero opened this issue 6 years ago • 2 comments

I have a question about the code. Generally, when training a neural network, L2 regularization (the so-called weight decay) is added to the total loss. But in your code, and in the zjj TensorFlow implementation, the loss seems to be only the distance between the labels and the predictions. Do I understand the code wrong, or am I missing the weight decay factor? I did not find any discussion of this in your paper. I am curious whether you add weight decay to your loss, and if not, why not? Or is this a trick for training networks on regression problems? Thank you for your reply.

BKZero avatar May 08 '18 03:05 BKZero
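
For context on what the question is describing: weight decay is conventionally added as an L2 penalty on the weights on top of the data term. Below is a minimal TensorFlow sketch of that pattern, assuming a generic landmark-regression head; the layer sizes, loss form, and decay coefficient are illustrative and are not taken from the DAN code or paper.

```python
import tensorflow as tf

# Hypothetical landmark-regression head; layer sizes are illustrative only.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(136,)),
    tf.keras.layers.Dense(136),  # 68 landmarks * (x, y)
])

weight_decay = 1e-4  # illustrative coefficient, not a value from the DAN paper

def total_loss(y_true, y_pred):
    # Data term: mean distance between ground-truth and predicted landmarks.
    data_term = tf.reduce_mean(tf.norm(y_true - y_pred, axis=-1))
    # L2 penalty ("weight decay") over all trainable variables
    # (biases included here for brevity).
    l2_term = tf.add_n([tf.nn.l2_loss(v) for v in model.trainable_variables])
    return data_term + weight_decay * l2_term
```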

Hi,

You are correct, there is no weight decay in DAN. If I remember correctly, I tried using it and it did not improve the accuracy. I also ran some tests on landmark stability (jitter) and found that adding weight decay decreases the jitter, but so does early stopping.

Best regards,

Marek

MarekKowalski avatar May 08 '18 08:05 MarekKowalski
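
Since the reply points to early stopping as an alternative way to control jitter, here is a rough sketch of that approach; it is again only an illustration, not code or settings from the DAN experiments.

```python
import tensorflow as tf

# A minimal early-stopping sketch; the monitored metric and patience
# are illustrative values, not settings from the DAN experiments.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",          # watch the validation loss
    patience=10,                 # stop after 10 epochs without improvement
    restore_best_weights=True,   # roll back to the best checkpoint
)

# model.fit(x_train, y_train,
#           validation_data=(x_val, y_val),
#           epochs=200,
#           callbacks=[early_stop])
```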

ok~thanks very much again~

BKZero avatar May 08 '18 08:05 BKZero