
Vanilla REINFORCE implementation

Open alek5k opened this issue 5 years ago • 2 comments

Hello,

Would there be any benefit to including a vanilla REINFORCE implementation for people trying to learn the concepts? REINFORCE with Baseline already includes a value function approximator, which makes it look a lot like Actor-Critic.

I think being able to see a pure policy gradient method could be useful as a learning tool; otherwise people may assume that policy gradient methods must include some kind of value function approximation.
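To make the request concrete, here is a minimal sketch of what a pure policy gradient could look like with no value function anywhere. It assumes a tabular softmax policy and a hypothetical five-state corridor environment (`run_episode`, the reward of -1 per step, and all hyperparameters are illustrative assumptions, not code from this repository). The update follows the standard Monte Carlo form: theta += alpha * gamma^t * G_t * grad log pi(a_t | s_t).

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array of action preferences."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} for every step, working backwards."""
    returns = np.zeros(len(rewards))
    g = 0.0
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


def run_episode(theta, rng, n_states=5, max_steps=50):
    """Hypothetical corridor: start in state 0, action 1 moves right,
    action 0 moves left (blocked at state 0). Reward is -1 per step and
    the episode ends at the rightmost state or after max_steps."""
    state = 0
    states, actions, rewards = [], [], []
    for _ in range(max_steps):
        probs = softmax(theta[state])
        action = int(rng.choice(2, p=probs))
        states.append(state)
        actions.append(action)
        state = max(0, state - 1) if action == 0 else state + 1
        rewards.append(-1.0)
        if state == n_states - 1:
            break
    return states, actions, rewards


def reinforce(n_episodes=500, alpha=0.02, gamma=0.99, seed=0, n_states=5):
    """Vanilla REINFORCE: Monte Carlo returns only, no baseline, no critic."""
    rng = np.random.default_rng(seed)
    theta = np.zeros((n_states, 2))  # one row of action preferences per state
    episode_returns = []
    for _ in range(n_episodes):
        states, actions, rewards = run_episode(theta, rng, n_states)
        returns = discounted_returns(rewards, gamma)
        for t, (s, a) in enumerate(zip(states, actions)):
            probs = softmax(theta[s])
            # Gradient of log softmax w.r.t. theta[s]: one_hot(a) - probs
            grad_log_pi = -probs
            grad_log_pi[a] += 1.0
            theta[s] += alpha * (gamma ** t) * returns[t] * grad_log_pi
        episode_returns.append(sum(rewards))
    return theta, episode_returns


if __name__ == "__main__":
    theta, episode_returns = reinforce()
    print("mean return, first 50 episodes:", np.mean(episode_returns[:50]))
    print("mean return, last 50 episodes:", np.mean(episode_returns[-50:]))
```

The only learned quantity is `theta`, the policy parameters. Adding a baseline would mean subtracting a learned state value from `returns[t]` in the update line, which is exactly the piece this sketch leaves out.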

alek5k, May 08 '19 04:05

Look at this if you want to see the high-variance results of vanilla REINFORCE.

makaveli10, Apr 30 '20 06:04

Can I implement vanilla REINFORCE?

vieveks, Feb 07 '23 07:02