How to Correctly Integrate LSTM or GRU into the SAC Algorithm

Open namjiwon1023 opened this issue 2 years ago • 1 comment

I have looked at several people's work on adding RNNs to reinforcement learning algorithms, but strangely, almost every implementation is different. So I would like to ask how you integrate an LSTM or GRU into the SAC algorithm.

In my implementation, I have incorporated LSTM into both the actor and critic networks. The image below shows the LSTM added to the actor network.

During training, I initialize the LSTM's input hidden state.

I also initialize the input hidden state when the environment is reset.
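For readers without the images, here is a minimal sketch of the setup described above. It is not the poster's actual code. It assumes PyTorch, and the class name `LSTMActor`, the method names `initial_state` and `act`, the layer sizes, and the zero initialization of the hidden state are all hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn


class LSTMActor(nn.Module):
    """Hypothetical Gaussian policy with an LSTM between encoder and heads."""

    def __init__(self, obs_dim, act_dim, hidden_dim=64):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.mean = nn.Linear(hidden_dim, act_dim)
        self.log_std = nn.Linear(hidden_dim, act_dim)

    def initial_state(self, batch_size=1):
        # Assumption: zero-initialized (h, c), each (num_layers, batch, hidden).
        shape = (1, batch_size, self.hidden_dim)
        return torch.zeros(shape), torch.zeros(shape)

    def forward(self, obs_seq, state):
        # obs_seq has shape (batch, time, obs_dim).
        x = torch.relu(self.encoder(obs_seq))
        x, state = self.lstm(x, state)
        mean = self.mean(x)
        log_std = self.log_std(x).clamp(-20.0, 2.0)
        return mean, log_std, state

    @torch.no_grad()
    def act(self, obs, state):
        # Single environment step: obs has shape (obs_dim,).
        mean, log_std, state = self(obs.view(1, 1, -1), state)
        noise = torch.randn_like(mean)
        action = torch.tanh(mean + log_std.exp() * noise)
        return action.view(-1), state


actor = LSTMActor(obs_dim=3, act_dim=2)
state = actor.initial_state()            # done whenever the environment resets
for _ in range(5):
    obs = torch.randn(3)                 # stand-in for a real observation
    action, state = actor.act(obs, state)  # carry the state to the next step
```

In this sketch the hidden state is carried from step to step within an episode and replaced by `actor.initial_state()` whenever the environment is reset. A critic could follow the same pattern with the action concatenated to the observation.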

I would like to ask whether my way of adding the LSTM is correct. How did you incorporate an RNN into SAC?

Thank you, I look forward to your reply.

Jan 17 '24 12:01 namjiwon1023

Are you aware of any reference implementations? There are a couple of ways it can be done. The problem is that in PPO I reuse the hidden state from the previous step, but in SAC the replay buffer can hold very old sequences, so the hidden state recorded with them is stale and cannot be reused.
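One of the possible ways, sketched here only as an illustration and not as code from any particular library, is to store whole episodes, sample fixed-length windows, start each window from a fresh hidden state, and optionally run a few "burn-in" steps through the network to warm the state up before computing losses. The class `EpisodeReplayBuffer` and its parameters `seq_len` and `burn_in` are hypothetical names.

```python
import random
from collections import deque


class EpisodeReplayBuffer:
    """Hypothetical buffer: stores whole episodes, samples step windows."""

    def __init__(self, max_episodes=1000):
        self.episodes = deque(maxlen=max_episodes)
        self.current = []

    def add(self, obs, action, reward, next_obs, done):
        # Transitions accumulate until the episode ends, then it is stored.
        self.current.append((obs, action, reward, next_obs, done))
        if done:
            self.episodes.append(self.current)
            self.current = []

    def sample(self, batch_size, seq_len, burn_in=0):
        """Return a list of (burn_in_steps, train_steps) pairs."""
        window = burn_in + seq_len
        eligible = [ep for ep in self.episodes if len(ep) >= window]
        if not eligible:
            raise ValueError("no stored episode is long enough")
        batch = []
        for _ in range(batch_size):
            ep = random.choice(eligible)
            start = random.randint(0, len(ep) - window)
            chunk = ep[start:start + window]
            # Burn-in steps only warm up the hidden state; losses are
            # computed on the remaining seq_len steps.
            batch.append((chunk[:burn_in], chunk[burn_in:]))
        return batch
```

At update time, each sampled window would be fed to the recurrent actor and critic starting from a freshly initialized hidden state, with the burn-in steps run without gradients. Other options exist, such as storing the hidden state alongside each sequence and accepting that it is somewhat stale.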

Jan 22 '24 18:01 Denys88