reinforcement-learning Vanilla REINFORCE implementation

Open alek5k opened this issue 5 years ago • 2 comments

Hello,

Is there any benefit to having a vanilla REINFORCE implementation for people who are trying to learn the concepts? REINFORCE with Baseline includes a value function approximator, which makes it look a lot like Actor-Critic.

I think being able to see a pure policy gradient method would be useful as a learning tool. Otherwise, people may assume that policy gradient methods always need some kind of value function approximation.
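To make the request concrete, here is a rough sketch of what a baseline-free version could look like. It is not code from this repository. It assumes a made-up three-state chain environment (`step`, `N_STATES`, and the reward scheme are all hypothetical) and a tabular softmax policy, and it uses only the Python standard library. The thing to notice is that the update uses only the sampled Monte Carlo return `G_t`, with no value function anywhere.

```python
import math
import random

N_STATES = 3          # non-terminal states 0..2 of a toy chain (hypothetical environment)
LEFT, RIGHT = 0, 1


def step(state, action):
    """Toy dynamics (an assumption): RIGHT advances, LEFT ends the episode with no reward."""
    if action == LEFT:
        return state, 0.0, True
    next_state = state + 1
    done = next_state == N_STATES
    return next_state, (1.0 if done else 0.0), done


def softmax(prefs):
    """Turn action preferences into probabilities (shifted by the max for stability)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma):
    """G_t = r_t + gamma * r_{t+1} + ..., computed backwards over one episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # one preference per (state, action)
    for _ in range(episodes):
        # 1. Sample one complete episode with the current policy.
        state, done, trajectory, rewards = 0, False, [], []
        while not done:
            probs = softmax(theta[state])
            action = RIGHT if rng.random() < probs[RIGHT] else LEFT
            next_state, reward, done = step(state, action)
            trajectory.append((state, action))
            rewards.append(reward)
            state = next_state
        # 2. Monte Carlo returns only: no baseline and no critic.
        returns = discounted_returns(rewards, gamma)
        # 3. theta += alpha * gamma^t * G_t * grad log pi(a_t | s_t).
        for t, ((s, a), g) in enumerate(zip(trajectory, returns)):
            probs = softmax(theta[s])
            for b in (LEFT, RIGHT):
                # Gradient of log-softmax w.r.t. the preference of action b.
                grad_log_pi = (1.0 if b == a else 0.0) - probs[b]
                theta[s][b] += alpha * (gamma ** t) * g * grad_log_pi
    return theta


theta = reinforce()
policy = [softmax(prefs) for prefs in theta]
for s, probs in enumerate(policy):
    print(f"state {s}: P(RIGHT) = {probs[RIGHT]:.3f}")
```

In this toy chain only fully successful episodes produce a non-zero return, so the example is much tamer than a real task. On something like CartPole the same update would be far noisier, which is exactly what a baseline is meant to reduce.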

May 08 '19 04:05 alek5k

Look at this if you want to see the high-variance results of vanilla REINFORCE.

Apr 30 '20 06:04 makaveli10

Can I implement vanilla REINFORCE?

Feb 07 '23 07:02 vieveks
