Added Policy Gradient Tutorial
*Description*: Implementation of Vanilla Monte Carlo Policy Gradients on the CartPole-v0 environment, added as a tutorial.
*Tests*: Run the script.
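For context, vanilla Monte Carlo policy gradients (REINFORCE) weight the log-probability of each taken action by the discounted return G_t from that step. Below is a minimal sketch of the return computation with illustrative names; it is not necessarily how the tutorial implements it:

```julia
# Discounted return G_t = Σ_{k≥0} γ^k · r_{t+k}, computed backwards in one pass.
function discount_rewards(rewards::Vector{Float64}, γ::Float64 = 0.99)
    G = similar(rewards)
    running = 0.0
    for t in length(rewards):-1:1
        running = rewards[t] + γ * running
        G[t] = running
    end
    return G
end
```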
@tejank10
`update!(opt, params(policy))` does not find a matching method candidate. I tried

```julia
grads = Tracker.gradient(() -> loss(state, act, G_t), params(policy))
for p in params(policy)
    update!(opt, p, grads[p])
end
```

but even this throws errors.
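For reference, here is a hedged sketch of the update idiom documented for Flux 0.8 with the Tracker backend; `policy`, `loss`, and the data are stand-ins rather than the PR's actual definitions. If the MethodError persists, a Flux/Tracker version mismatch is a likely cause, which the Project.toml requested below should help rule out.

```julia
using Flux
using Flux.Tracker

policy = Chain(Dense(4, 16, relu), Dense(16, 2), softmax)  # toy CartPole policy
opt    = ADAM(0.01)

state, act, G_t = rand(4), 1, 1.0
# REINFORCE-style loss: negative log-probability of the taken action,
# weighted by the discounted return.
loss(s, a, G) = -log(policy(s)[a]) * G

θ = Flux.params(policy)
grads = Tracker.gradient(() -> loss(state, act, G_t), θ)
Flux.Optimise.update!(opt, θ, grads)  # steps every parameter in place
```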
Please also add a Project.toml and Manifest.toml so it is easier to standardize the environment.
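For reproducibility, the standard way to generate both files is through Pkg from inside the tutorial folder; the package names here are illustrative:

```julia
using Pkg
Pkg.activate(".")         # create/activate a project in the current folder
Pkg.add(["Flux", "Gym"])  # records deps in Project.toml and pins Manifest.toml
```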
@dhairyagandhi96 Added the files
@tejank10 are you happy with the changes made here, or is there more to do?
@MikeInnes Thanks for the reply, and apologies for the earlier mistakes. The DCGAN code should not have been included in this PR; there is a separate PR for that, so I have removed the GAN code here. The changes you mentioned for the GAN part will be made in that PR.
@tejank10 I have made the requested changes. Sorry for the long delay; I got caught up in other work and did not fix the errors as they came up. I have also added functions to normalize the discounted rewards, which should help the network train.
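A typical way to normalize discounted returns is to standardize them to zero mean and unit variance, which reduces the variance of the gradient estimate. This is a sketch with assumed names, and the PR's actual helper may differ:

```julia
using Statistics

# Standardize the discounted returns; ϵ guards against division by zero
# when all returns in an episode are equal.
function normalize_rewards(G::Vector{Float64}; ϵ = 1e-8)
    return (G .- mean(G)) ./ (std(G) + ϵ)
end
```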
It will also need to be in its own folder and have a simple README. Otherwise this is looking good, I think, but it'd be good to hear from @tejank10.
@MikeInnes Sorry for the delayed response. I have made the changes. Is the README sufficient for now or is there something more to be added?