Deep-QLearning-Agent-for-Traffic-Signal-Control

Reward

Open kgayush opened this issue 3 years ago • 1 comments

The reward is the sum of cumulative wait time, right? How is it going negative in the graph (after running testing_main.py)? [image: plot_reward]

kgayush, Apr 17 '21 10:04

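I think the reward is defined as the difference in cumulative wait time between consecutive action intervals, not as the cumulative wait time itself. So both positive and negative rewards can be received, depending on whether the wait time went down or up since the previous action.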

GangSuUGA, Aug 03 '21 14:08
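To make the difference-based reward from the reply above concrete, here is a minimal sketch. It assumes the sign convention "previous cumulative wait minus current cumulative wait", which may not match the repository exactly. The function name `step_rewards` and the wait-time values are hypothetical and are not taken from the project's code.

```python
def step_rewards(cumulative_waits):
    """Return one reward per action interval.

    Assumed convention: reward = previous total wait - current total wait,
    so the reward is negative whenever the total wait time grew.
    """
    rewards = []
    for previous, current in zip(cumulative_waits, cumulative_waits[1:]):
        rewards.append(previous - current)
    return rewards


# Hypothetical total wait times (seconds) measured at each action step.
waits = [0, 40, 100, 90, 30]
rewards = step_rewards(waits)

print(rewards)       # [-40, -60, 10, 60]
print(sum(rewards))  # -30
```

Under this assumption, any interval in which queues build up contributes a negative reward, and the per-episode sum is negative whenever the wait time at the end is higher than at the start, which would explain a plot that dips below zero.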