ENAS-pytorch

REINFORCE

Open carpedm20 opened this issue 7 years ago • 0 comments

It is clear that the controller falls into a local optimum and cannot find better actions through REINFORCE. I think three things need to be fixed: the unknown constant c in the reward c/valid_ppl, the moving-average baseline, and the temperature of the logits. See more details (especially the TODOs) in 497c2e717dc0087fea52d4f196d30543e4fb7512.
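For reference, the three pieces mentioned above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code in the referenced commit: the function names are hypothetical, and the default values `c=80.0`, `decay=0.95`, and `temperature=1.0` are placeholders chosen only for the example.

```python
import math


def reward_from_ppl(valid_ppl, c=80.0):
    """Reward of the form c / valid_ppl (c is an assumed placeholder value)."""
    return c / valid_ppl


def update_baseline(baseline, reward, decay=0.95):
    """Exponential moving average of rewards, used as the REINFORCE baseline."""
    if baseline is None:
        # First observation: start the average at the first reward.
        return reward
    return decay * baseline + (1.0 - decay) * reward


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature; a higher temperature gives a flatter distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_loss(log_prob, reward, baseline):
    """Surrogate loss -(R - b) * log pi(a) for one sampled action."""
    return -(reward - baseline) * log_prob
```

A larger c scales every reward up, the baseline only reduces the variance of the gradient estimate, and a temperature above 1 makes the controller sample more uniformly, which may help it leave a local optimum.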

carpedm20, Feb 15 '18 08:02