Jost Tobias Springenberg
There seems to be a bug in your implementation: as far as I can see, you are calculating maxaction based on q_vals (which contains the Q values for s_t and...
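
For context, here is a minimal sketch of the standard one-step Q-learning target, in which the max is taken over the Q values of the next state s_{t+1} rather than s_t. It is only an illustration under assumptions: the function name `q_learning_targets` and the parameters `rewards`, `q_next`, `done` and `gamma` are hypothetical and do not come from the implementation under discussion.

```python
from typing import List, Sequence


def q_learning_targets(
    rewards: Sequence[float],
    q_next: Sequence[Sequence[float]],
    done: Sequence[bool],
    gamma: float = 0.99,
) -> List[float]:
    """Return one-step targets r_t + gamma * max_a Q(s_{t+1}, a).

    q_next[i] holds the Q values of the NEXT state for transition i
    (hypothetical layout, one inner sequence per transition).
    """
    targets = []
    for reward, q_row, terminal in zip(rewards, q_next, done):
        # Bootstrap from the best action in s_{t+1}; terminal
        # transitions have no future value to bootstrap from.
        bootstrap = 0.0 if terminal else gamma * max(q_row)
        targets.append(reward + bootstrap)
    return targets


# Two transitions; the second one ends the episode.
example = q_learning_targets(
    rewards=[1.0, 0.0],
    q_next=[[0.5, 2.0], [1.0, 3.0]],
    done=[False, True],
    gamma=0.5,
)
```

If the greedy action were instead picked from the Q values of s_t, the target would bootstrap from the wrong state, which is the kind of mix-up worth checking for here.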