Thomas Spooner

14 comments by Thomas Spooner

Hey, you're absolutely right. I'm not sure why, but this seems to be yet another discrepancy between my private development repo and this public one - as in your issue...

Honestly, I stopped using multi-threaded training quite some time before the main results of the paper were found. It doesn't surprise me much that it is broken. I realise that's...

Yeah, `OnlineRLearn` is the on-policy R-learning algorithm that was introduced by Sutton. It's the equivalent of Q-learning for continuing tasks - i.e. it solves for a different objective: the expected...

Hey @dichen9412 and @mbasso! Sorry for the delayed response - I've been very busy with follow-up work. First off, I appreciate that there is a lack of documentation with this...