Atari
Atari copied to clipboard
Implement Retrace(λ)
Safe and efficient off-policy reinforcement learning implements this new algorithm with experience replay, but actually uses asynchrononous agents with experience replay for testing (the combination was going to happen soon enough). Which means that this repo is a unique position of having both components already implemented.