miniyosshi

1 comment by miniyosshi

Hi, related to this problem: I found that the env doesn't receive an action on the very first iteration. In the training loop, `prev_action = torch.zeros(1, trainer.action_size).to(trainer.device) # initialize` ... `next_obs, rew, done,...
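
To make the symptom concrete, here is a minimal sketch in plain Python (no torch). The rest of the loop in the comment above is cut off, so everything here is an assumption for illustration: `RecordingEnv`, `policy`, `run`, and the `act_before_step` flag are hypothetical names, and the sketch assumes the env is stepped with `prev_action` before the policy has produced anything. It is not the repository's actual training loop.

```python
# Hypothetical toy environment: it only records every action passed to step().
class RecordingEnv:
    def __init__(self):
        self.received = []

    def step(self, action):
        self.received.append(list(action))
        next_obs = [float(len(self.received))]
        return next_obs, 0.0, False  # next_obs, rew, done


def policy(obs, action_size):
    # Stand-in policy (assumption): always returns a non-zero action.
    return [1.0] * action_size


def run(num_steps, action_size, act_before_step):
    env = RecordingEnv()
    obs = [0.0]
    # Placeholder action, mirroring the zero initialization in the comment.
    prev_action = [0.0] * action_size
    for _ in range(num_steps):
        if act_before_step:
            # Choose the action first, so the env sees a real action at step 0.
            prev_action = policy(obs, action_size)
        obs, rew, done = env.step(prev_action)
        if not act_before_step:
            # Action chosen only after stepping: step 0 used the placeholder.
            prev_action = policy(obs, action_size)
    return env.received


stale = run(3, 2, act_before_step=False)
fresh = run(3, 2, act_before_step=True)
print(stale[0])  # [0.0, 0.0]
print(fresh[0])  # [1.0, 1.0]
```

Under these assumptions, the first call to `step` receives the zero placeholder whenever the policy is queried only after stepping, which would match the behaviour described in the comment.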