AutoDeeplab

Do the network weights and the architecture weights (coefficients) train together?

Open · Linfengscat opened this issue on Jul 19 '19 · 2 comments

I think it would be better if we trained the network weights and the architecture weights separately: to be exact, freeze the gradients of α and β when updating w, and freeze the gradient of w when updating α and β.

By the definition of: [attached screenshot: 微信图片_20190719180918]
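(The attached screenshot is not reproduced here. It presumably refers to the DARTS-style bilevel objective that Auto-DeepLab adopts, where the architecture coefficients α, β are optimized on one data split and the network weights w on another. A sketch of that formulation, written out here for context rather than taken from the image:)

```latex
% Bilevel search objective (DARTS-style), assumed to be what the screenshot shows:
% alpha, beta are updated against the validation split, w against the training split.
\begin{aligned}
\min_{\alpha,\beta}\quad & \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\alpha,\beta),\,\alpha,\beta\bigr) \\
\text{s.t.}\quad & w^{*}(\alpha,\beta) = \arg\min_{w}\ \mathcal{L}_{\mathrm{train}}(w,\alpha,\beta)
\end{aligned}
```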

Linfengscat · Jul 19 '19 10:07

I believe the code already works this way: the model's optimizer contains only the weight parameters, and the architecture optimizer contains only alpha and beta. Please correct me if that isn't right.

HankKung · Jul 19 '19 15:07
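For reference, below is a minimal, self-contained PyTorch sketch of the pattern described above, not the repository's actual code: the toy model, the parameter names (`alpha`, `beta`, `weight_parameters`, `arch_parameters`), and the hyperparameters are all illustrative. The point it shows is that because each optimizer is registered with a disjoint parameter set, each `step()` only updates its own group, which has the same effect as freezing the other group during that update.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-in for a searchable network: ordinary weights (w) plus
# architecture coefficients alpha/beta registered as separate parameters.
class TinySearchNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv3 = nn.Conv2d(3, 8, 3, padding=1)   # candidate op 1 (weights w)
        self.conv5 = nn.Conv2d(3, 8, 5, padding=2)   # candidate op 2 (weights w)
        self.head = nn.Linear(8, 10)                 # classifier (weights w)
        self.alpha = nn.Parameter(1e-3 * torch.randn(2))  # cell-level coefficients
        self.beta = nn.Parameter(1e-3 * torch.randn(2))   # network-level coefficients

    def forward(self, x):
        a = torch.softmax(self.alpha, dim=0)         # continuous relaxation over ops
        feat = a[0] * self.conv3(x) + a[1] * self.conv5(x)
        b = torch.softmax(self.beta, dim=0)          # toy mix of two "resolution" paths
        pooled_full = feat.mean(dim=[2, 3])
        pooled_half = F.avg_pool2d(feat, 2).mean(dim=[2, 3])
        return self.head(b[0] * pooled_full + b[1] * pooled_half)

    def weight_parameters(self):
        return [p for n, p in self.named_parameters() if n not in ('alpha', 'beta')]

    def arch_parameters(self):
        return [self.alpha, self.beta]

model = TinySearchNet()
criterion = nn.CrossEntropyLoss()

# Two optimizers over disjoint parameter groups: one for w, one for alpha/beta.
w_optimizer = torch.optim.SGD(model.weight_parameters(), lr=0.025, momentum=0.9)
arch_optimizer = torch.optim.Adam(model.arch_parameters(), lr=3e-4)

# Dummy batches standing in for the two data splits used during search.
x_train, y_train = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
x_val, y_val = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))

# Step 1: update w on the "training" split; alpha/beta receive gradients
# but are not updated because they are not registered with w_optimizer.
w_optimizer.zero_grad()
criterion(model(x_train), y_train).backward()
w_optimizer.step()

# Step 2: update alpha/beta on the "validation" split; w is untouched here.
arch_optimizer.zero_grad()
criterion(model(x_val), y_val).backward()
arch_optimizer.step()
```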

@HankKung Sorry, I was careless. Thanks!

Linfengscat · Jul 22 '19 07:07