motion-planner-reinforcement-learning

Did you manage to solve the action saturation issue?

Open krishanrana opened this issue 5 years ago • 4 comments

Hi, I see that this code outputs both linear and angular velocity. How did you get around the action saturation problem?

krishanrana, Aug 11 '19 23:08

I have the same problem. When outputting both linear and angular velocity, the outputs saturate at 1 or 0 after some gradient descent steps. I have checked the code and tested it using the Pendulum-v0 environment, but I don't know how to fix it.

ZhengXinyue, Nov 18 '19 08:11
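One possible reason outputs stay stuck at 1 or 0 once they get there: if the actor's last layer squashes its outputs with sigmoid or tanh (an assumption here, not something confirmed in this thread), the gradient through that layer shrinks toward zero as the pre-activation grows, so later updates barely move the output. A minimal sketch with only the standard library, using hypothetical pre-activation values:

```python
import math

def sigmoid(x):
    """Logistic function, maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

# Hypothetical pre-activation values; larger magnitude means deeper saturation.
for z in (0.0, 2.0, 5.0, 10.0):
    print(f"z={z:5.1f}  sigmoid'={sigmoid_grad(z):.2e}  tanh'={tanh_grad(z):.2e}")
```

If the printed gradients collapse at the pre-activation magnitudes your actor actually produces, that would point toward large pre-activations (for example from unscaled inputs or a high learning rate) rather than toward the environment itself.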

Hello, I have also run into the action saturation problem. I noticed that the author uses batch normalization after each layer in the actor network. Is that the fix for the problem, and have you solved it yet?

ZhihanLee, Jul 28 '21 08:07

Hello, I have the same problem. Do you have a solution? I would be grateful for any help!

Mr-Y-B-L, Dec 28 '23 12:12

I am using SAC and ran into the same issue. I changed many hyperparameters (learning rate, batch size, initial memory, state format, action range) and the issue went away. I am not sure which factor was the key, but I am pretty sure it is not batch norm, because I do not use batch norm in my model.

phuongboi, Mar 16 '24 01:03
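Since several hyperparameters were changed at once above, one way to narrow down which one matters is to log how often the actions sit at their bounds while changing one setting at a time. The sketch below is only an illustration: `saturation_fraction`, the tolerance, and the bounds (linear velocity in [0, 1], angular velocity in [-1, 1]) are hypothetical and not taken from this repository.

```python
def saturation_fraction(actions, low, high, tol=0.01):
    """Return the fraction of action components lying within `tol`
    (as a fraction of each component's range) of a lower or upper bound.

    actions: list of actions, each a list of floats
    low, high: per-component bounds (assumed values, set them for your env)
    """
    saturated = 0
    total = 0
    for action in actions:
        for a, lo, hi in zip(action, low, high):
            margin = tol * (hi - lo)
            if a <= lo + margin or a >= hi - margin:
                saturated += 1
            total += 1
    return saturated / total if total else 0.0

# Hypothetical batch of [linear, angular] actions sampled from the actor.
batch = [[1.0, 0.0], [0.5, 0.2]]
print(saturation_fraction(batch, low=[0.0, -1.0], high=[1.0, 1.0]))
```

Tracking this number per training run makes it easier to see whether a single change, such as the learning rate or the action range, is what stops the outputs from pinning to the bounds.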