phasic-policy-gradient
Add evaluation logging
This commit adds periodic logging of evaluation scores for the policy being trained.
It also adds num_levels and start_level to the arguments.
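
A minimal sketch of what the two changes could look like, not the code from this commit. The names num_levels and start_level come from the message above. Everything else is an assumption for illustration: the eval_interval flag, the defaults, and the helpers build_parser, maybe_log_eval, run_eval_episodes and log.

```python
import argparse
from statistics import mean


def build_parser():
    # num_levels and start_level are named in the commit message.
    # Their defaults and the eval_interval flag are hypothetical.
    parser = argparse.ArgumentParser()
    parser.add_argument("--num_levels", type=int, default=0)
    parser.add_argument("--start_level", type=int, default=0)
    parser.add_argument("--eval_interval", type=int, default=10)
    return parser


def maybe_log_eval(iteration, eval_interval, run_eval_episodes, log):
    """Log the mean evaluation score every eval_interval iterations.

    run_eval_episodes is a hypothetical callable returning a list of
    episode scores for the current policy; log is a hypothetical
    callable that accepts one dict per record.
    """
    if eval_interval <= 0 or iteration % eval_interval != 0:
        return None  # not an evaluation iteration
    score = mean(run_eval_episodes())
    log({"iteration": iteration, "eval_score_mean": score})
    return score
```

In a training loop, maybe_log_eval would be called once per iteration with the parsed eval_interval, so evaluation scores are logged only on every eval_interval-th iteration.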
Based on code from @rraileanu