G-learning, test with infinite horizon

Open DanielTakeshi opened this issue 8 years ago • 0 comments

It turns out that the G-learning paper doesn't use the episodic setting (at least for the cliff-world experiments, which are my main concern). Let's write a new, non-episodic cliff-world environment and check whether our results match theirs.
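A rough sketch of what a non-episodic cliff world could look like, as a starting point for discussion. Everything specific here is an assumption and not taken from the G-learning paper: the class name `ContinuingCliffWorld`, the 4x12 layout and the -1 / -100 rewards (borrowed from the Sutton and Barto cliff-walking example), the zero reward at the goal, and the choice to teleport the agent back to the start instead of ending an episode. The paper's actual layout and costs should be substituted once confirmed.

```python
class ContinuingCliffWorld:
    """Hypothetical cliff world with no terminal states (infinite horizon).

    Assumed layout: the bottom row holds the start (left corner), the goal
    (right corner), and the cliff cells in between.
    """

    # Action index -> (row delta, col delta): up, right, down, left.
    MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]

    def __init__(self, rows=4, cols=12, step_cost=-1.0,
                 cliff_cost=-100.0, goal_reward=0.0):
        # All defaults are placeholder assumptions, not the paper's values.
        self.rows, self.cols = rows, cols
        self.step_cost = step_cost
        self.cliff_cost = cliff_cost
        self.goal_reward = goal_reward
        self.start = (rows - 1, 0)
        self.goal = (rows - 1, cols - 1)
        self.state = self.start

    def is_cliff(self, pos):
        r, c = pos
        return r == self.rows - 1 and 0 < c < self.cols - 1

    def step(self, action):
        """Apply an action and return (next_state, reward).

        There is deliberately no `done` flag: falling off the cliff or
        reaching the goal teleports the agent to the start, and the
        interaction simply continues.
        """
        dr, dc = self.MOVES[action]
        # Moves that would leave the grid keep the agent at the border.
        r = min(max(self.state[0] + dr, 0), self.rows - 1)
        c = min(max(self.state[1] + dc, 0), self.cols - 1)
        nxt = (r, c)
        if self.is_cliff(nxt):
            reward, nxt = self.cliff_cost, self.start
        elif nxt == self.goal:
            reward, nxt = self.goal_reward, self.start
        else:
            reward = self.step_cost
        self.state = nxt
        return nxt, reward
```

With this interface the learning loop would run for a fixed number of steps rather than looping over episodes, which is the main change the existing episodic training code would need.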

Jan 17 '17 22:01 DanielTakeshi