Reinforcement-learning-with-tensorflow
Simple reinforcement learning tutorials, 莫烦Python (Morvan Python) Chinese-language AI tutorials
I encountered this problem when using gym: ValueError: invalid literal for int() with base 10: 'None'. It appears at 'env.render()'. My Python version is 2.7. Could anyone tell me what...
Hello Zhou: I am confused about how the reward guides PPO in training the ANNs. 1. For example, I feed batch_size data samples to the ANNs, then I will...
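One common way rewards enter a PPO-style update is through discounted returns, which serve as critic targets and, minus the critic's value estimates, as advantages that weight the policy gradient. The following is a generic sketch under that assumption, not a description of this repository's code; the function names `discounted_returns` and `advantages`, and the default `gamma`, are hypothetical.

```python
def discounted_returns(rewards, bootstrap_value=0.0, gamma=0.9):
    """Return the discounted return for every step of a collected batch.

    rewards: per-step rewards in time order.
    bootstrap_value: value estimate of the state after the last step
                     (assumed to come from a critic; 0.0 if terminal).
    """
    running = bootstrap_value
    returns = []
    # Walk backwards so each step accumulates the discounted future reward.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage = discounted return minus the critic's value estimate."""
    return [g - v for g, v in zip(returns, values)]
```

A positive advantage means the action did better than the critic expected, so the policy update raises its probability; a negative one lowers it.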
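As everyone knows, RL training is extremely unstable. I believe Morvan also relies on a number of small tricks when training; could you share them? Also, after downloading the DDPG code and training it, I cannot reach the results shown in the video. What could be the reason?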
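https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/blob/97dba9bafce7fb5203d395ba77a770fad80931b3/contents/10_A3C/A3C_continuous_action.py#L131 The done value returned by the environment at line 130 (whether the game has ended) is forcibly overwritten at line 131 (whether the episode has reached its last step). In other words, even if the game ends in the environment, the current episode does not end.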
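The overwrite described in this issue can be illustrated with a toy loop. This is only a sketch and not the repository's code: `ToyEnv`, `run_episode`, and the `respect_env_done` flag are hypothetical names, and the stand-in environment simply terminates after a fixed number of steps.

```python
class ToyEnv:
    """Hypothetical stand-in environment that terminates after `end_at` steps."""

    def __init__(self, end_at):
        self.end_at = end_at
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        env_done = self.t >= self.end_at
        return self.t, 0.0, env_done, {}


def run_episode(env, max_ep_step, respect_env_done=True):
    """Run one episode and return how many steps were taken."""
    env.reset()
    steps = 0
    for ep_t in range(max_ep_step):
        _, _, env_done, _ = env.step(0)
        steps += 1
        reached_limit = ep_t == max_ep_step - 1
        if respect_env_done:
            # Combine both signals: stop on termination OR on the step limit.
            done = env_done or reached_limit
        else:
            # The pattern the issue describes: the environment's flag is discarded.
            done = reached_limit
        if done:
            break
    return steps
```

With `ToyEnv(3)` and a limit of 10 steps, the combined check ends the episode after 3 steps, while the overwriting variant keeps stepping until the limit. Whether ignoring the environment's flag is intended may depend on the task, for example when the environment is unwrapped and never terminates on its own.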
Hi MorvanZhou, I have a question about the state size. I read the comments that you have added. However, I could not properly understand the state space size and how...
    env = gym.make('Pendulum-v0').unwrapped
    ppo = PPO()
    all_ep_r = []
    for ep in range(EP_MAX):
        s = env.reset()
        buffer_s, buffer_a, buffer_r = [], [], []
        ep_r = 0
        for t in range(EP_LEN):...

    with tf.variable_scope('eval_net'):
        # c_names(collections_names) are the collections to store variables
        c_names, n_l1, w_initializer, b_initializer = \
            ['eval_net_params', tf.GraphKeys.GLOBAL_VARIABLES], 10, \
            tf.random_normal_initializer(0., 0.3), tf.constant_initializer(0.1)  # config of layers
        # first...