pytorch-a3c
PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
After `value, logit, (hx, cx) = model((Variable(state.unsqueeze(0)), (hx, cx)))` in train.py, the program hangs and never continues. Do you have any idea why?
Hi! First of all, thank you very much for your easy-to-follow implementation. Very intuitive and simple. :+1: My question is about your use of LSTMCell to implement the recurrent version...
I'm training an A3C these days, but after some steps the NN always takes the same action. The game I'm training on is similar to Go. There will be...
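A policy collapsing onto one action is often discussed in connection with A3C's entropy bonus, which rewards keeping the action distribution spread out. A minimal, framework-free sketch of how policy entropy is measured (the function name and the example distributions are illustrative, not from the repo):

```python
import math

def policy_entropy(probs):
    """Shannon entropy of an action distribution; a value near zero
    means the policy has collapsed to (almost) always one action."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# A healthy, exploratory policy over 4 actions:
uniform = [0.25, 0.25, 0.25, 0.25]
# A collapsed policy that nearly always takes action 0:
collapsed = [0.97, 0.01, 0.01, 0.01]

print(policy_entropy(uniform))    # maximal: log(4), about 1.386
print(policy_entropy(collapsed))  # close to 0
```

In A3C the negative entropy is typically added to the policy loss, scaled by a small coefficient, so gradients push entropy back up; if the agent collapses early, increasing that coefficient is a common first thing to try.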
The program gets stuck at `p.join()` (line 77 in `main.py`). More specifically, it gets stuck wherever `model` is used in `test.py` and `train.py`, e.g. `model.load_state_dict(shared_model.state_dict())` (line 35 in `test.py`). Actually, I tried to...
I can't find a clear dependency list for this project, such as a `requirements.txt` or `environment.yml` file. This is problematic; for example, if I spin up the repo on Colab,...
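For reference, such a `requirements.txt` would only need a few entries; the package names below are a guess at the usual dependencies for this kind of project, not taken from the repo, and any version pins would have to be checked against the code:

```
torch
gym
numpy
```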
For A3C, there are many learners that learn from the interactions generated by the local actors. At some iteration, a learner reaches the condition that requires updating the model....
`action = prob.multinomial(num_samples=1).detach()` at line 59 of train.py. May I use an epsilon-greedy strategy to choose an action?
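The quoted line samples the action from the policy distribution itself; an epsilon-greedy alternative would look roughly like the following pure-Python sketch (the function, `epsilon`, and the probabilities are illustrative, not from train.py):

```python
import random

def epsilon_greedy(probs, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action,
    otherwise pick the action the policy currently rates highest."""
    if rng.random() < epsilon:
        return rng.randrange(len(probs))
    return max(range(len(probs)), key=lambda a: probs[a])

probs = [0.1, 0.6, 0.2, 0.1]       # hypothetical policy output
print(epsilon_greedy(probs, 0.0))  # greedy: always picks action 1
```

One caveat: A3C's policy-gradient loss assumes actions are sampled from the current policy, so switching to epsilon-greedy makes the gradient estimate off-policy and biased unless an importance-weighting correction is added.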
Hi, I wanted to test your code on my platform, but there seems to be an error. Can you please help me fix it? I have attached the error log....
https://github.com/ikostrikov/pytorch-a3c/blob/48d95844755e2c3e2c7e48bbd1a7141f7212b63f/train.py#L81 Starting from this line and the three that follow:
```python
R = torch.zeros(1, 1)
if not done:
    value, _, _ = model((state.unsqueeze(0), (hx, cx)))
    R = value.detach()
```
Why does value...
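The lines quoted above set up the bootstrap target for the n-step return: `R` starts from the critic's value of the last state (zero if the episode ended), and `detach()` stops gradients from flowing into that target. A framework-free sketch of the backward return computation that follows in train.py (function name, rewards, and values here are made up for illustration):

```python
def n_step_returns(rewards, last_value, done, gamma=0.99):
    """Discounted returns for a rollout, bootstrapping from the
    critic's estimate of the final state unless the episode ended.
    `last_value` plays the role of `value.detach()` in train.py."""
    R = 0.0 if done else last_value
    returns = []
    for r in reversed(rewards):      # walk the rollout backwards
        R = r + gamma * R
        returns.append(R)
    return list(reversed(returns))

# Episode ended: no bootstrap, just the discounted reward sum.
print(n_step_returns([1.0, 1.0], last_value=5.0, done=True))
# Episode cut off mid-game: bootstrap from the value estimate.
print(n_step_returns([1.0, 1.0], last_value=5.0, done=False))
```

Detaching matters because `R` is a regression *target* for the value loss; without it, gradients from `(R - value)^2` would also push the final-state value estimate toward itself.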