reinforcement-learning
Batch update for Continuous Mountain Car Actor-Critic
In https://github.com/dennybritz/reinforcement-learning/blob/master/PolicyGradient/Continuous%20MountainCar%20Actor%20Critic%20Solution.ipynb, I found that the actor and the value function are updated at every time step:
# Update the value estimator
estimator_value.update(state, td_target)
# Update the policy estimator
# using the td error as our advantage estimate
estimator_policy.update(state, td_error, action)
How can I batch the updates to the actor and the value function? The overhead of calling TensorFlow's session at every step is not small when the network is large.
I have the same question after 4 years. Did you find the answer?
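Not a definitive answer, but one possible approach is to collect transitions for several steps and then run a single update per batch, so the session is called once per batch instead of once per step. Below is a minimal sketch in plain NumPy. `TransitionBatch` and `td_targets_and_errors` are hypothetical helpers, not part of the notebook, and the sketch assumes the estimators are modified so that `predict` and `update` accept arrays with a leading batch dimension. Masking the bootstrap term on terminal transitions is also an assumption of this sketch.

```python
import numpy as np


class TransitionBatch:
    """Hypothetical helper: collects transitions until a batch is ready."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.clear()

    def clear(self):
        self.states, self.actions, self.rewards = [], [], []
        self.next_states, self.dones = [], []

    def add(self, state, action, reward, next_state, done):
        self.states.append(state)
        self.actions.append(action)
        self.rewards.append(reward)
        self.next_states.append(next_state)
        self.dones.append(float(done))

    def is_full(self):
        return len(self.states) >= self.batch_size

    def as_arrays(self):
        # Stack each field into one array with a leading batch dimension.
        return tuple(
            np.asarray(field, dtype=np.float32)
            for field in (self.states, self.actions, self.rewards,
                          self.next_states, self.dones)
        )


def td_targets_and_errors(rewards, values, next_values, dones,
                          discount_factor=1.0):
    """Vectorized TD(0) targets and errors for a batch of transitions."""
    # Assumption: terminal transitions do not bootstrap from the next state.
    td_targets = rewards + discount_factor * next_values * (1.0 - dones)
    td_errors = td_targets - values
    return td_targets, td_errors


# Intended use inside the episode loop. The batched predict/update calls are
# hypothetical: they require estimators that accept a leading batch dimension.
#
#   batch.add(state, action, reward, next_state, done)
#   if batch.is_full() or done:
#       states, actions, rewards, next_states, dones = batch.as_arrays()
#       values = estimator_value.predict(states)
#       next_values = estimator_value.predict(next_states)
#       td_targets, td_errors = td_targets_and_errors(
#           rewards, values, next_values, dones, discount_factor)
#       estimator_value.update(states, td_targets)
#       estimator_policy.update(states, td_errors, actions)
#       batch.clear()
```

One trade-off to keep in mind: with this scheme the actions in a batch are all sampled from the policy as it was before the batch update, and the TD targets use the value estimates from that same point, so learning may behave differently from the per-step version in the notebook.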