AutoDeeplab

Do the network weights and the architecture weights (coefficients) train together?

Open · Linfengscat opened this issue on Jul 19 '19 · 2 comments

I think it would be better if we trained the network weights and the architecture weights separately: to be exact, freeze the gradients of α and β when updating w, and freeze the gradient of w when updating α and β.

By the definition of: [attached screenshot: 微信图片_20190719180918]
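(The attached screenshot is not reproduced here. It presumably refers to the DARTS-style bilevel objective that Auto-DeepLab adopts, where the architecture coefficients α, β are optimized on one data split and the network weights w on another. A sketch of that formulation, written out here for context rather than taken from the image:)

```latex
% Bilevel search objective (DARTS-style), assumed to be what the screenshot shows:
% alpha, beta are updated against the validation split, w against the training split.
\begin{aligned}
\min_{\alpha,\beta}\quad & \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\alpha,\beta),\,\alpha,\beta\bigr) \\
\text{s.t.}\quad & w^{*}(\alpha,\beta) = \arg\min_{w}\ \mathcal{L}_{\mathrm{train}}(w,\alpha,\beta)
\end{aligned}
```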

Linfengscat · Jul 19 '19 10:07

I believe the code already works this way: the model's optimizer contains only the weight parameters, and the architecture optimizer contains only alpha and beta. Please correct me if that isn't right.

HankKung · Jul 19 '19 15:07
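For reference, below is a minimal, self-contained PyTorch sketch of the pattern described above, not the repository's actual code: the toy model, the parameter names (`alpha`, `beta`, `weight_parameters`, `arch_parameters`), and the hyperparameters are all illustrative. The point it shows is that because each optimizer is registered with a disjoint parameter set, each `step()` only updates its own group, which has the same effect as freezing the other group during that update.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-in for a searchable network: ordinary weights (w) plus
# architecture coefficients alpha/beta registered as separate parameters.
class TinySearchNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv3 = nn.Conv2d(3, 8, 3, padding=1)   # candidate op 1 (weights w)
        self.conv5 = nn.Conv2d(3, 8, 5, padding=2)   # candidate op 2 (weights w)
        self.head = nn.Linear(8, 10)                 # classifier (weights w)
        self.alpha = nn.Parameter(1e-3 * torch.randn(2))  # cell-level coefficients
        self.beta = nn.Parameter(1e-3 * torch.randn(2))   # network-level coefficients

    def forward(self, x):
        a = torch.softmax(self.alpha, dim=0)         # continuous relaxation over ops
        feat = a[0] * self.conv3(x) + a[1] * self.conv5(x)
        b = torch.softmax(self.beta, dim=0)          # toy mix of two "resolution" paths
        pooled_full = feat.mean(dim=[2, 3])
        pooled_half = F.avg_pool2d(feat, 2).mean(dim=[2, 3])
        return self.head(b[0] * pooled_full + b[1] * pooled_half)

    def weight_parameters(self):
        return [p for n, p in self.named_parameters() if n not in ('alpha', 'beta')]

    def arch_parameters(self):
        return [self.alpha, self.beta]

model = TinySearchNet()
criterion = nn.CrossEntropyLoss()

# Two optimizers over disjoint parameter groups: one for w, one for alpha/beta.
w_optimizer = torch.optim.SGD(model.weight_parameters(), lr=0.025, momentum=0.9)
arch_optimizer = torch.optim.Adam(model.arch_parameters(), lr=3e-4)

# Dummy batches standing in for the two data splits used during search.
x_train, y_train = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
x_val, y_val = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))

# Step 1: update w on the "training" split; alpha/beta receive gradients
# but are not updated because they are not registered with w_optimizer.
w_optimizer.zero_grad()
criterion(model(x_train), y_train).backward()
w_optimizer.step()

# Step 2: update alpha/beta on the "validation" split; w is untouched here.
arch_optimizer.zero_grad()
criterion(model(x_val), y_val).backward()
arch_optimizer.step()
```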

@HankKung Sorry, I was careless. Thanks!

Linfengscat · Jul 22 '19 07:07