pytorch_a3c

How to modify code for continuous actions?

Open ajaytalati opened this issue 7 years ago • 3 comments

Hi @rarilurelo,

can I ask if you have been able to modify your code to work with continuous actions - e.g. pendulum or mountain car? I tried to modify @ikostrikov's implementation, see here

https://discuss.pytorch.org/t/continuous-action-a3c/1033

but could not get it to work. I think @pfre00 has tried too, but he said training was not stable, see here

https://github.com/pfre00/a3c/issues/1

Have you got any advice?

Kind regards,

Ajay

ajaytalati avatar Mar 16 '17 05:03 ajaytalati

Hi, any chance you could give me some advice? I'm still stuck trying to get this to work. Here's my code:

https://gist.github.com/AjayTalati/184fec867380f6fa22b9aa0951143dec

I keep getting this error,

File "main_single.py", line 174, in <module>
value_loss = value_loss + advantage.pow(2)
AttributeError: 'numpy.ndarray' object has no attribute 'pow'

I don't understand why advantage has become a numpy array instead of a torch.tensor - this never occurred with the discrete action implementation.

Any ideas what I've got wrong?

Thanks a lot for your help,

Best,

Ajay

ajaytalati avatar Mar 18 '17 17:03 ajaytalati

@AjayTalati I don't know why this error occurs, but I was able to fix it by replacing line 137 of your gist with rewards.append(float(max(min(reward, 1), -1))), i.e. by wrapping the clipped reward in float().

After that fix I hit another error during backpropagation, related to the stochastic function. I suggest that you use the reinforce method.

rarilurelo avatar Mar 22 '17 01:03 rarilurelo
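
The float() fix suggested above can be isolated in a small helper. This is only a sketch: the name clip_reward and the sample rewards are hypothetical and do not come from the gist. One plausible reading, not confirmed in this thread, is that the environment returned the reward as a NumPy value, so later arithmetic with it produced NumPy objects instead of tensors; converting to a built-in float avoids that.

```python
def clip_reward(reward, low=-1.0, high=1.0):
    """Clip a scalar reward to [low, high] and return a built-in Python float."""
    # Assumption: `reward` may arrive as a NumPy scalar (e.g. numpy.float64).
    # float() converts it to a plain Python float before it is stored.
    return float(max(min(reward, high), low))


rewards = []
for raw_reward in (-16.3, 0.25, 4.0):  # made-up rewards, for illustration only
    rewards.append(clip_reward(raw_reward))

print(rewards)  # [-1.0, 0.25, 1.0]
```

With the default bounds this matches the expression float(max(min(reward, 1), -1)) from the comment above.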

Hi @rarilurelo, thank you very much for your help :+1:

I will do as you suggest, and try to modify the code from the .reinforce example in the PyTorch examples,

https://github.com/pytorch/examples/blob/master/reinforcement_learning/reinforce.py

I wonder if you know of any examples of how to use .reinforce on batch problems? Perhaps something very simple/synthetic, that does not use a gym environment?

ajaytalati avatar Mar 22 '17 10:03 ajaytalati
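
As a possible starting point for the kind of simple, synthetic batch example asked about above, here is a minimal framework-free sketch of the REINFORCE (score-function) update for a continuous action. It does not use PyTorch's .reinforce or a gym environment. Everything in it is an assumption made for illustration: the function name train_gaussian_policy, the quadratic reward, the fixed standard deviation, and all hyperparameters.

```python
import random


def train_gaussian_policy(target=2.0, sigma=0.5, lr=0.05,
                          batch_size=64, steps=500, seed=0):
    """REINFORCE update for the mean of a Gaussian policy on a toy problem."""
    rng = random.Random(seed)
    mu = 0.0  # learnable policy mean; sigma is kept fixed for simplicity
    for _ in range(steps):
        # Sample a batch of continuous actions a ~ N(mu, sigma^2).
        actions = [rng.gauss(mu, sigma) for _ in range(batch_size)]
        # Synthetic reward (hypothetical): largest when the action hits target.
        rewards = [-(a - target) ** 2 for a in actions]
        # Batch-mean baseline, used to reduce the variance of the estimate.
        baseline = sum(rewards) / batch_size
        # Score function: d/dmu log N(a; mu, sigma^2) = (a - mu) / sigma^2
        grad = sum((r - baseline) * (a - mu) / sigma ** 2
                   for a, r in zip(actions, rewards)) / batch_size
        mu += lr * grad  # gradient ascent on the expected reward
    return mu


if __name__ == "__main__":
    print(train_gaussian_policy())  # should end up close to target=2.0
```

The same structure carries over to a neural network policy: replace mu with the network output and accumulate the log-probability gradient weighted by (reward - baseline) over the batch.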