Human-in-the-loop-Deep-Reinforcement-Learning
question about actor loss
Hi,

In the paper, the actor loss is [equation not preserved], but the code that calculates the actor loss for human-intervention steps does not include the first term (see https://github.com/wujingda/Human-in-the-loop-Deep-Reinforcement-Learning/blob/main/TD3_based_DRL/TD3HUG.py#L148).

Also, the human-intervention weight in the actor loss in the code has a soft update coefficient that is not included in the paper, and I don't understand what this coefficient is for (see https://github.com/wujingda/Human-in-the-loop-Deep-Reinforcement-Learning/blob/main/TD3_based_DRL/TD3HUG.py#L144).

Are these bugs in the code, or tricks that I have misunderstood? Looking forward to your help.
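
To make the second question concrete: by "soft update coefficient" I mean a blending factor of the generic form sketched below. This is only an illustration of the pattern I think I am seeing, not the code from TD3HUG.py; the names `soft_update` and `tau` and the numbers are hypothetical.

```python
def soft_update(old_value: float, new_value: float, tau: float) -> float:
    """Blend new_value into old_value; tau in [0, 1] sets how far to move."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    # tau = 0 keeps the old value, tau = 1 jumps straight to the new value.
    return (1.0 - tau) * old_value + tau * new_value


# Hypothetical example: a weight drifting toward a target of 3.0.
weight = 1.0
for target in (3.0, 3.0, 3.0):
    weight = soft_update(weight, target, tau=0.5)
```

If the coefficient in the code plays this role, the weight would change gradually between updates rather than being recomputed from scratch each step, but I may well be misreading it.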