pytorch-a3c
PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
After `value, logit, (hx, cx) = model((Variable(state.unsqueeze(0)), (hx, cx)))` in train.py, the program hangs and never continues. Do you have any idea why?
Hi! First of all, thank you very much for your easy-to-follow implementation. Very intuitive and simple. :+1: My question is about your use of LSTMCell to implement the recurrent version...
I'm training an A3C these days, but after some steps the NN always takes the same action. The game I'm training on is similar to Go. There will be...
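A policy collapsing onto one action is often discussed in connection with A3C's entropy bonus, which rewards keeping the action distribution spread out. A minimal, framework-free sketch of how policy entropy is measured (the function name and the example distributions are illustrative, not from the repo):

```python
import math

def policy_entropy(probs):
    """Shannon entropy of an action distribution; a value near zero
    means the policy has collapsed to (almost) always one action."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# A healthy, exploratory policy over 4 actions:
uniform = [0.25, 0.25, 0.25, 0.25]
# A collapsed policy that nearly always takes action 0:
collapsed = [0.97, 0.01, 0.01, 0.01]

print(policy_entropy(uniform))    # maximal: log(4), about 1.386
print(policy_entropy(collapsed))  # close to 0
```

In A3C the negative entropy is typically added to the policy loss, scaled by a small coefficient, so gradients push entropy back up; if the agent collapses early, increasing that coefficient is a common first thing to try.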
The program gets stuck at `p.join()` (line 77 in `main.py`). More specifically, it gets stuck wherever `model` is used in `test.py` and `train.py`, e.g. `model.load_state_dict(shared_model.state_dict())` (line 35 in `test.py`). Actually, I tried to...
I can't find a clear dependency list for this project, such as a `requirements.txt` or `environment.yml` file. This is problematic; for example, if I spin up the repo on Colab,...
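For reference, such a `requirements.txt` would only need a few entries; the package names below are a guess at the usual dependencies for this kind of project, not taken from the repo, and any version pins would have to be checked against the code:

```
torch
gym
numpy
```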
For A3C, there are many learners that learn from the interactions generated by the local actors. At some iteration, a learner reaches the condition that requires updating the model....
`action = prob.multinomial(num_samples=1).detach()` at line 59 of train.py. May I use an epsilon-greedy strategy to choose an action?
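The quoted line samples the action from the policy distribution itself; an epsilon-greedy alternative would look roughly like the following pure-Python sketch (the function, `epsilon`, and the probabilities are illustrative, not from train.py):

```python
import random

def epsilon_greedy(probs, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action,
    otherwise pick the action the policy currently rates highest."""
    if rng.random() < epsilon:
        return rng.randrange(len(probs))
    return max(range(len(probs)), key=lambda a: probs[a])

probs = [0.1, 0.6, 0.2, 0.1]       # hypothetical policy output
print(epsilon_greedy(probs, 0.0))  # greedy: always picks action 1
```

One caveat: A3C's policy-gradient loss assumes actions are sampled from the current policy, so switching to epsilon-greedy makes the gradient estimate off-policy and biased unless an importance-weighting correction is added.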
Hi, I wanted to test your code on my platform, but there seems to be an error. Can you please help me fix it? I have attached the error log....
https://github.com/ikostrikov/pytorch-a3c/blob/48d95844755e2c3e2c7e48bbd1a7141f7212b63f/train.py#L81 Starting from this line and the three that follow:
```python
R = torch.zeros(1, 1)
if not done:
    value, _, _ = model((state.unsqueeze(0), (hx, cx)))
    R = value.detach()
```
Why does value...
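The lines quoted above set up the bootstrap target for the n-step return: `R` starts from the critic's value of the last state (zero if the episode ended), and `detach()` stops gradients from flowing into that target. A framework-free sketch of the backward return computation that follows in train.py (function name, rewards, and values here are made up for illustration):

```python
def n_step_returns(rewards, last_value, done, gamma=0.99):
    """Discounted returns for a rollout, bootstrapping from the
    critic's estimate of the final state unless the episode ended.
    `last_value` plays the role of `value.detach()` in train.py."""
    R = 0.0 if done else last_value
    returns = []
    for r in reversed(rewards):      # walk the rollout backwards
        R = r + gamma * R
        returns.append(R)
    return list(reversed(returns))

# Episode ended: no bootstrap, just the discounted reward sum.
print(n_step_returns([1.0, 1.0], last_value=5.0, done=True))
# Episode cut off mid-game: bootstrap from the value estimate.
print(n_step_returns([1.0, 1.0], last_value=5.0, done=False))
```

Detaching matters because `R` is a regression *target* for the value loss; without it, gradients from `(R - value)^2` would also push the final-state value estimate toward itself.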