Khanh Nguyen

Results 3 comments of Khanh Nguyen

Always returning tail sums may not be optimal. Sometimes you want to do something with the rewards first, such as rescaling, injecting noise, or exponentiating. I think it is better to return per-step...
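
As a rough illustration of the point above, here is a minimal sketch of why per-step rewards are the more flexible thing to return. It assumes "tail sum" means the sum of rewards from step t to the end of the episode. The names `tail_sums` and `transform` are hypothetical and do not come from any particular library.

```python
from typing import Callable, List, Optional


def tail_sums(
    rewards: List[float],
    transform: Optional[Callable[[float], float]] = None,
) -> List[float]:
    """Return, for each step t, the sum of rewards from t to the end.

    `transform` is a hypothetical hook: it is applied to every per-step
    reward (e.g. rescaling or exponentiating) before the sums are taken.
    """
    if transform is not None:
        rewards = [transform(r) for r in rewards]

    sums = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each tail sum reuses the one after it.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        sums[t] = running
    return sums


# Per-step rewards as the environment would return them.
per_step = [1.0, 2.0, 3.0]

plain = tail_sums(per_step)
rescaled = tail_sums(per_step, transform=lambda r: 2.0 * r)
```

With per-step rewards in hand, the caller can apply any transformation and still compute tail sums afterwards. The reverse does not hold in general: a nonlinear per-step transformation such as exponentiating cannot be applied to rewards that have already been summed.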