Deep-QLearning-Agent-for-Traffic-Signal-Control
Why are Q network and target network the same?
Usually, the Q network is trained while the parameters of the target network are held fixed, and every fixed number of steps the Q network's parameters are copied to the target network. But when I checked your code, I found that the Q network and the target network are the same neural network, which confuses me. Could you please help me out?
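The mechanism you describe is there to reduce oscillation during training. This demo project simply doesn't use it and lets the Q network update directly at every step. There's nothing confusing about that...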
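For readers comparing the two setups discussed in this thread, here is a minimal sketch of the periodic target-network copy. It is not taken from this repository: the `LinearQ` class, the `train` function, the `sync_every` parameter, and the random transitions are all hypothetical stand-ins, and a tiny linear Q-function in NumPy replaces the real neural network.

```python
import numpy as np


class LinearQ:
    """Hypothetical stand-in for a Q network: Q(s) = s @ W, one column per action."""

    def __init__(self, n_features, n_actions, rng):
        self.W = rng.normal(scale=0.1, size=(n_features, n_actions))

    def predict(self, state):
        return state @ self.W

    def copy_from(self, other):
        # Hard update: overwrite these weights with a copy of the other network's.
        self.W = other.W.copy()


def train(sync_every, steps=200, gamma=0.9, lr=0.05, seed=0):
    """Run Q-learning updates on random transitions (illustrative data only)."""
    rng = np.random.default_rng(seed)
    q_net = LinearQ(4, 2, rng)
    target_net = LinearQ(4, 2, rng)
    target_net.copy_from(q_net)  # both networks start from the same weights
    n_syncs = 0

    for step in range(1, steps + 1):
        # Fake transition (s, a, r, s'); a real agent would sample from replay memory.
        s = rng.normal(size=4)
        a = int(rng.integers(2))
        r = float(rng.normal())
        s_next = rng.normal(size=4)

        # The TD target is computed from the frozen target network.
        target = r + gamma * np.max(target_net.predict(s_next))
        td_error = target - q_net.predict(s)[a]

        # Only the Q network is trained.
        q_net.W[:, a] += lr * td_error * s

        # Every `sync_every` steps, copy the Q network into the target network.
        if step % sync_every == 0:
            target_net.copy_from(q_net)
            n_syncs += 1

    return q_net, target_net, n_syncs


if __name__ == "__main__":
    _, _, syncs = train(sync_every=50, steps=200)
    print("target network synced", syncs, "times")
```

In this sketch, setting `sync_every=1` copies the weights after every update, so the targets always come from the current Q network. That behaves like the single-network setup described in the reply above, while a larger `sync_every` gives the delayed-copy mechanism from the question.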