various eligibility trace-equipped TD methods

Open baedan opened this issue 3 years ago • 4 comments

as far as i can tell, only off-line λ-return is implemented (TDλReturnLearner). any interest in implementing others, such as TD(λ), n-step truncated return, true online TD(λ), and so on? i'm working through a textbook chapter on eligibility traces, and i'm happy to contribute implementations.
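For readers unfamiliar with the methods named above, here is a minimal sketch of tabular TD(λ) with accumulating eligibility traces, written in plain Python rather than against this package's API. The function name `td_lambda_episode` and the `(state, reward, next_state, done)` transition format are assumptions made for illustration; they do not come from ReinforcementLearning.jl.

```python
def td_lambda_episode(values, episode, alpha, gamma, lam):
    """Tabular TD(lambda) with accumulating traces over one episode.

    values  : list of state-value estimates, indexed by integer state
    episode : iterable of (state, reward, next_state, done) transitions
              (hypothetical format, chosen for this sketch)
    Returns a new list; the input is not modified.
    """
    v = list(values)
    z = [0.0] * len(v)  # eligibility traces, reset at the start of the episode
    for state, reward, next_state, done in episode:
        # One-step TD error; a terminal transition bootstraps from 0.
        target = reward if done else reward + gamma * v[next_state]
        delta = target - v[state]
        # Decay every trace, then increment the trace of the visited state.
        z = [gamma * lam * zi for zi in z]
        z[state] += 1.0
        # Each state's value moves in proportion to its trace.
        v = [vi + alpha * delta * zi for vi, zi in zip(v, z)]
    return v


episode = [(0, 0.0, 1, False), (1, 1.0, 2, True)]
print(td_lambda_episode([0.0, 0.0, 0.0], episode, alpha=0.5, gamma=1.0, lam=0.5))
# prints [0.25, 0.5, 0.0]
```

Unlike the off-line λ-return, this update is applied at every step, so no return needs to be stored until the end of the episode. With `lam=0` the trace is nonzero only for the current state and the update reduces to one-step TD(0).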

baedan avatar Jun 25 '22 09:06 baedan

Hi @baedan, it would be much appreciated if you could implement them.

In fact, we also need contributors to work on porting the tabular methods to the latest workflow in the master branch.

findmyway avatar Jun 25 '22 15:06 findmyway

Since we've missed the application window for GSoC and OSPP this year, I'm considering setting up GitHub Sponsors for this org to raise money for the work.

findmyway avatar Jun 25 '22 18:06 findmyway

> In fact, we also need contributors to work on porting the tabular methods to the latest workflow in the master branch.

would be great if there’s a document i can refer to for the list of intended changes in the design / a piece of example code demonstrating usage. i tried to use the new implementations for reference (QRDQN, etc) but since i’m not familiar with the underlying algorithms it’s a bit difficult haha

baedan avatar Jun 26 '22 01:06 baedan

Good suggestion. I think the new design is fairly stable now, so I'll focus on the documentation over the next week. I'll ping you when it's ready.

findmyway avatar Jun 26 '22 04:06 findmyway