tensorflow-rl
PseudoCountQLearner
In CTS-DQN, why is the CTS model updated with the next frame rather than with the same frame that was used for action selection? See https://github.com/steveKapturowski/tensorflow-rl/blob/master/algorithms/intrinsic_motivation_actor_learner.py#L417
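
To make the question concrete, here is a minimal sketch of the ordering being asked about. It is not the repository's code: `ToyDensityModel`, `pseudo_count_bonus`, and `rollout_bonuses` are hypothetical names, a Laplace-smoothed count table stands in for the CTS model, frames are assumed to be hashable values rather than image arrays, and `beta` and the `0.01` offset are illustrative choices. The pseudo-count is derived from the model's probability of a frame before and after training on it.

```python
import math
from collections import Counter


class ToyDensityModel:
    """Hypothetical stand-in for the CTS model: smoothed counts over hashable frames."""

    def __init__(self, num_symbols, alpha=0.5):
        self.counts = Counter()
        self.total = 0
        self.num_symbols = num_symbols  # assumed size of the frame alphabet
        self.alpha = alpha              # Laplace smoothing constant (assumption)

    def prob(self, frame):
        return (self.counts[frame] + self.alpha) / (
            self.total + self.alpha * self.num_symbols
        )

    def update(self, frame):
        """Train on `frame`; return its probability before and after the update."""
        rho = self.prob(frame)
        self.counts[frame] += 1
        self.total += 1
        return rho, self.prob(frame)


def pseudo_count_bonus(model, frame, beta=0.05):
    """Exploration bonus from the pseudo-count rho * (1 - rho') / (rho' - rho)."""
    rho, rho_prime = model.update(frame)
    pseudo_count = rho * (1.0 - rho_prime) / (rho_prime - rho)
    return beta / math.sqrt(pseudo_count + 0.01)


def rollout_bonuses(frames, model, beta=0.05):
    """Bonus for step t is computed from frames[t + 1], i.e. the next frame.

    frames[t] is the frame the action was selected from; it is never passed
    to the density model here, which is the ordering the question is about.
    """
    return [pseudo_count_bonus(model, nxt, beta) for nxt in frames[1:]]
```

With this ordering the density model is trained on the frame reached by the action, so a repeated next frame earns a smaller bonus the second time, while the first frame of the rollout never enters the model.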