5G-Federation Monte Carlo vs TD

Open Bahador-Bakhshi opened this issue 4 years ago • 0 comments

"The next most obvious advantage of TD methods over Monte Carlo methods is that they are naturally implemented in an online, fully incremental fashion. With Monte Carlo methods one must wait until the end of an episode, because only then is the return known, whereas with TD methods one need wait only one time step."

But in our application, we don't need to wait until the end of an episode: each action has its own return.

So it may be possible to apply Monte Carlo to this problem.
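To make the difference in the quoted passage concrete, here is a minimal sketch of the two tabular value updates. It is not code from this repository: the state names, `alpha`, `gamma`, and the function names `mc_update` and `td0_update` are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical transition format: (state, reward, next_state).
Transition = Tuple[str, float, str]


def mc_update(values: Dict[str, float], episode: List[Transition],
              alpha: float = 0.1, gamma: float = 0.9) -> Dict[str, float]:
    """Every-visit, constant-alpha Monte Carlo update.

    Needs the complete episode, because the return g of each state
    depends on all rewards that follow it.
    """
    g = 0.0
    for state, reward, _ in reversed(episode):
        g = reward + gamma * g  # discounted return from this step onward
        old = values.get(state, 0.0)
        values[state] = old + alpha * (g - old)
    return values


def td0_update(values: Dict[str, float], transition: Transition,
               alpha: float = 0.1, gamma: float = 0.9) -> Dict[str, float]:
    """TD(0) update: needs only one transition, so it can run online."""
    state, reward, next_state = transition
    # Bootstrapped target: immediate reward plus estimated next-state value.
    target = reward + gamma * values.get(next_state, 0.0)
    old = values.get(state, 0.0)
    values[state] = old + alpha * (target - old)
    return values


# Hypothetical two-step episode ending in a terminal state "end".
episode = [("s0", 1.0, "s1"), ("s1", 2.0, "end")]

mc_values = mc_update({}, episode)      # applied once, after the episode ends

td_values: Dict[str, float] = {}
for step in episode:                    # applied after every single step
    td0_update(td_values, step)
```

Note that when the next state is terminal, the TD(0) target reduces to the immediate reward, which is also the Monte Carlo return for that step. So if every action really does complete its own return, as suggested above, the two updates would coincide under these assumptions.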

Oct 17 '20 11:10 Bahador-Bakhshi
