
architecture parameters initialization

Open JingweiZhang12 opened this issue 4 years ago • 0 comments

Thanks for your great work! I found an inconsistency in the architecture parameter initialization between your paper and your code. The paper says that the architecture variables are initialized to zero, which implies an equal amount of attention over all possible ops. The code, however, initializes the architecture variables randomly: 'Variable(1e-3*torch.randn(k, num_ops).cuda(), requires_grad=True)'. Could you please explain why you use this random initialization?
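To make the question concrete, here is a minimal sketch comparing the two initializations after a softmax over the ops. It uses only the Python standard library rather than PyTorch, and the shape values `k = 14` and `num_ops = 8`, the fixed seed, and the helper names `softmax` and `max_deviation_from_uniform` are assumptions made for illustration, not taken from the repository.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one row of architecture parameters."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def max_deviation_from_uniform(alphas):
    """Largest absolute gap between any op weight and the uniform weight 1/num_ops."""
    uniform = 1.0 / len(alphas[0])
    return max(abs(p - uniform) for row in alphas for p in softmax(row))


# Assumed shape: k edges, num_ops candidate ops per edge (illustrative values).
k, num_ops = 14, 8
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Zero initialization, as described in the paper.
zero_alphas = [[0.0] * num_ops for _ in range(k)]

# Small Gaussian initialization, mirroring 1e-3 * torch.randn(k, num_ops).
noisy_alphas = [
    [1e-3 * rng.gauss(0.0, 1.0) for _ in range(num_ops)] for _ in range(k)
]

zero_dev = max_deviation_from_uniform(zero_alphas)
noisy_dev = max_deviation_from_uniform(noisy_alphas)

print("max deviation from uniform, zero init: ", zero_dev)
print("max deviation from uniform, noisy init:", noisy_dev)
```

With a scale of 1e-3 the softmax weights stay very close to uniform, so the two schemes start from almost the same mixture over ops. Whether the small noise is intentional is still something only the authors can confirm.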

JingweiZhang12, Oct 12 '19 09:10