Batch update for Continuous Mountain Car Actor-Critic

Open GoingMyWay opened this issue 7 years ago • 1 comment

In https://github.com/dennybritz/reinforcement-learning/blob/master/PolicyGradient/Continuous%20MountainCar%20Actor%20Critic%20Solution.ipynb, I noticed that the actor and the value function are updated at every time step:

# Update the value estimator
estimator_value.update(state, td_target)
            
# Update the policy estimator
# using the td error as our advantage estimate
estimator_policy.update(state, td_error, action)

How can I update the actor and the value function in batches instead? The overhead of calling TensorFlow's session at every step is significant when the network is large.
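One possible direction, as a rough sketch rather than the notebook's own method: collect the per-step values in a small buffer and call each estimator once per batch instead of once per step. The `UpdateBuffer` class and the `update_value` / `update_policy` callbacks below are hypothetical names, not part of the notebook. The sketch assumes the estimators have been adapted so that their inputs accept a leading batch dimension and their losses are averaged over the batch.

```python
from typing import Callable, List, Sequence, Tuple

# One stored transition: (state, td_target, td_error, action).
Transition = Tuple[Sequence[float], float, float, float]


class UpdateBuffer:
    """Collects transitions and hands them to the estimators in batches.

    Hypothetical helper for illustration; it is not part of the notebook.
    """

    def __init__(
        self,
        batch_size: int,
        update_value: Callable[[List, List], None],
        update_policy: Callable[[List, List, List], None],
    ) -> None:
        self.batch_size = batch_size
        # Assumed to wrap batched versions of estimator_value.update and
        # estimator_policy.update (one session call per batch).
        self.update_value = update_value
        self.update_policy = update_policy
        self._items: List[Transition] = []

    def add(self, state, td_target, td_error, action) -> bool:
        """Store one transition; flush and return True once the batch is full."""
        self._items.append((state, td_target, td_error, action))
        if len(self._items) >= self.batch_size:
            self.flush()
            return True
        return False

    def flush(self) -> None:
        """Run one batched update on whatever is stored, then clear the buffer."""
        if not self._items:
            return
        # Transpose the list of transitions into four parallel columns.
        states, targets, errors, actions = (list(c) for c in zip(*self._items))
        self.update_value(states, targets)
        self.update_policy(states, errors, actions)
        self._items.clear()
```

In the training loop, the two per-step `update` calls would be replaced by `buffer.add(state, td_target, td_error, action)`, with a `buffer.flush()` at the end of each episode so no transitions are left over. One trade-off to keep in mind: the TD targets and TD errors inside a batch are computed with value estimates that have not yet seen the earlier transitions of that batch, so learning may behave differently from the per-step version.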

Oct 21 '18 13:10 GoingMyWay

I have the same question 4 years later. Did you find an answer?

Apr 07 '22 22:04 sharlec