5G-Federation

Monte Carlo vs TD

Open Bahador-Bakhshi opened this issue 4 years ago • 0 comments

"The next most obvious advantage of TD methods over Monte Carlo methods is that they are naturally implemented in an online, fully incremental fashion. With Monte Carlo methods one must wait until the end of an episode, because only then is the return known, whereas with TD methods one need wait only one time step."

But in our application, we don't need to wait until the end of the episode: each action has its own return.

So maybe it is possible to apply Monte Carlo to this problem.
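To make the comparison concrete, here is a minimal tabular sketch of the two update rules from the quote. The names `mc_update` and `td0_update`, the `(state, reward, next_state)` transition format, and the default step size and discount are illustrative assumptions, not code from this repository.

```python
from typing import Dict, List, Tuple

State = int
Transition = Tuple[State, float, State]  # (state, reward, next_state), hypothetical format


def mc_update(values: Dict[State, float], episode: List[Transition],
              alpha: float = 0.1, gamma: float = 1.0) -> None:
    """Every-visit Monte Carlo: can only run once the whole episode is known."""
    g = 0.0
    # Walk the episode backwards so the return g accumulates from the end.
    for state, reward, _ in reversed(episode):
        g = reward + gamma * g
        old = values.get(state, 0.0)
        values[state] = old + alpha * (g - old)


def td0_update(values: Dict[State, float], transition: Transition,
               alpha: float = 0.1, gamma: float = 1.0,
               terminal: bool = False) -> None:
    """TD(0): runs after a single step, bootstrapping from the next state's value."""
    state, reward, next_state = transition
    bootstrap = 0.0 if terminal else values.get(next_state, 0.0)
    target = reward + gamma * bootstrap
    old = values.get(state, 0.0)
    values[state] = old + alpha * (target - old)


if __name__ == "__main__":
    episode = [(0, 1.0, 1), (1, 2.0, 2)]  # state 2 is assumed terminal

    mc_values: Dict[State, float] = {}
    mc_update(mc_values, episode, alpha=0.5)  # one update pass, after the episode ends

    td_values: Dict[State, float] = {}
    for i, step in enumerate(episode):  # one update per time step
        td0_update(td_values, step, alpha=0.5, terminal=(i == len(episode) - 1))

    print("MC:", mc_values)
    print("TD(0):", td_values)
```

If the return of an action really is known as soon as the action is taken, that is equivalent to treating each step as a one-step episode. In this sketch, `mc_update` on a single transition then gives the same result as `td0_update` with `terminal=True`, so the waiting disadvantage mentioned in the quote would not apply. Whether that assumption holds for this application is the open question.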

Bahador-Bakhshi, Oct 17 '20 11:10