pytorch-rdpg
Shouldn't the input of the critic be the hidden state of the RNN?
Hi Faraz, I am studying the paper and your implementation is very helpful! I have a question, though. It seems that the critic network in the paper takes in the history (which in this case would be the hidden state of the actor's LSTM, I presume) rather than the observed state of the environment.
https://github.com/fshamshirdar/pytorch-rdpg/blob/master/rdpg.py#L139-L141
I was looking at exactly the same thing. Did you ever get an answer?
I think this is a valid concern. Making state information available to the critic makes this implementation incorrect.
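To make the concern concrete, here is a rough sketch of a critic that conditions on the history instead of the current observation. It is not code from this repo. It assumes, as I read the paper, that the history is the sequence of past observations and actions, and it gives the critic its own LSTM rather than reusing the actor's hidden state. The class name `RecurrentCritic`, the layer sizes, and the tensor shapes are all made up for illustration.

```python
import torch
import torch.nn as nn


class RecurrentCritic(nn.Module):
    """Hypothetical critic Q(h_t, a_t); not the class used in this repo."""

    def __init__(self, obs_dim, action_dim, hidden_dim=64):
        super().__init__()
        # The LSTM reads (o_t, a_{t-1}) at every step, so its output at step t
        # is a learned summary of the observation-action history up to t.
        self.lstm = nn.LSTM(obs_dim + action_dim, hidden_dim, batch_first=True)
        # The Q head sees the history summary plus the action being evaluated.
        self.q_head = nn.Sequential(
            nn.Linear(hidden_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, obs, prev_actions, actions, hidden=None):
        # obs: (batch, time, obs_dim)
        # prev_actions, actions: (batch, time, action_dim)
        summary, hidden = self.lstm(torch.cat([obs, prev_actions], dim=-1), hidden)
        q = self.q_head(torch.cat([summary, actions], dim=-1))
        return q, hidden


# Toy shapes, chosen only for the example: 2 episodes, 5 steps each.
critic = RecurrentCritic(obs_dim=3, action_dim=1)
obs = torch.randn(2, 5, 3)
actions = torch.randn(2, 5, 1)
# Shift actions by one step so step t sees a_{t-1}; the first step sees zeros.
prev_actions = torch.cat([torch.zeros(2, 1, 1), actions[:, :-1]], dim=1)
q_values, _ = critic(obs, prev_actions, actions)  # one Q-value per time step
```

Feeding the actor's hidden state into a feed-forward critic, as the original question suggests, would be another option. I went with a separate LSTM here only because it keeps the critic's gradients from flowing into the actor's recurrent weights.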