baby-a3c
A high-performance Atari A3C agent in 180 lines of PyTorch
https://github.com/greydanus/baby-a3c/blob/85899d76bd1a72d9e9055ed35894390802e31240/baby-a3c.py#L72-L78 Why did you override the default implementation of step(closure)? The default one calculates exponential moving average. Your implementation doesn't calculate the step count because it always returns None. I...
The shared_param.grad is synced only when it is None here https://github.com/greydanus/baby-a3c/blob/master/baby-a3c.py#L159. I am kind of confused. I think we have to sync it without the condition above. That means we...
Dear Author, I think when an episode is done, hx should be reset. I am not sure whether it's a bug in https://github.com/greydanus/baby-a3c/blob/master/baby-a3c.py#L144
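To make the reset proposed in this issue concrete, here is a minimal sketch in plain Python. It is not code from baby-a3c.py: `HIDDEN_SIZE`, `fresh_hidden`, and `next_hidden` are hypothetical names, and the hidden state is a plain list standing in for a tensor.

```python
HIDDEN_SIZE = 4  # hypothetical size, chosen only for illustration


def fresh_hidden():
    """Return a zeroed recurrent state (a stand-in for a zero tensor)."""
    return [0.0] * HIDDEN_SIZE


def next_hidden(hx, done):
    """Carry hx into the next step, but start from zeros once an episode ends."""
    return fresh_hidden() if done else hx


hx = [0.5, -0.2, 0.1, 0.9]
carried = next_hidden(hx, done=False)  # episode continues: memory is kept
reset = next_hidden(hx, done=True)     # episode ended: memory is cleared
```

Whether the reset is actually missing at the linked line depends on the surrounding code, which this sketch does not reproduce.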
Thanks for your great implementation. Currently I am trying to translate it to a TF2 implementation. But I find it difficult to understand the SharedAdam part and do not know how...
Hi, I believe we should break out of the `for step in range(args.rnn_steps):` loop when `done == True`. Currently, when the environment indicates that the episode is done, the loop...
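The early exit suggested in this issue can be sketched with a toy environment. This is an illustration under stated assumptions, not the repository's training loop: `ToyEnv` and `collect_rollout` are hypothetical, and only the loop shape (`for step in range(rnn_steps)` with a `done` flag) mirrors the issue text.

```python
class ToyEnv:
    """Hypothetical environment whose episode ends after `length` steps."""

    def __init__(self, length):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # initial observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        return self.t, 1.0, done  # (observation, reward, done)


def collect_rollout(env, rnn_steps):
    """Collect up to rnn_steps rewards, stopping early when the episode ends."""
    rewards = []
    for step in range(rnn_steps):
        _, reward, done = env.step(action=0)
        rewards.append(reward)
        if done:
            env.reset()  # start a new episode for the next rollout
            break        # do not mix two episodes in one rollout
    return rewards


env = ToyEnv(length=3)
env.reset()
rollout = collect_rollout(env, rnn_steps=20)
```

With a three-step episode and a 20-step budget, the rollout stops after three rewards instead of running on into the next episode.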
Hi, I ran the script in training mode, and even after the training was over, when I then ran it in test or render mode I was given the "no...