TensorFlow2.0-for-Deep-Reinforcement-Learning
TensorFlow 2.0 for Deep Reinforcement Learning. :octopus:
When initializing sigma, the code uses `sigma_initializer = tf.constant_initializer(self.std_init / np.sqrt(self.units))`. Based on the original paper (section 3.2), I would assume the initialization should look like this: `sigma_initializer =...
You get `next_state` from `self.get_n_step_info(self.n_step_buffer, self.gamma)`, but `next_state` is never used. Maybe `self.store_transition(p, obs, action, reward, next_obs, done)` should be `self.store_transition(p, obs, action, reward, next_state, done)`.
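To illustrate why the returned `next_state` matters, here is a minimal sketch of an n-step helper. It is not the repository's actual implementation: the standalone `get_n_step_info` function, the `(obs, action, reward, next_obs, done)` tuple layout, and the integer states are assumptions made for this example. The point is that the n-step reward is accumulated over several transitions, so the matching next state is the one at the end of that window (or at the first terminal step), not the `next_obs` of the most recent single step.

```python
from collections import deque


def get_n_step_info(n_step_buffer, gamma):
    """Return (n_step_reward, next_state, done) for the oldest transition.

    Hypothetical layout: each entry is (obs, action, reward, next_obs, done).
    """
    # Start from the newest transition in the window.
    _, _, reward, next_state, done = n_step_buffer[-1]

    # Walk backwards over the older transitions, discounting as we go.
    for _, _, r, n_o, d in reversed(list(n_step_buffer)[:-1]):
        # If this step ended the episode, nothing after it may contribute.
        reward = r + gamma * reward * (1 - d)
        if d:
            next_state, done = n_o, d

    return reward, next_state, done


# Three one-step transitions with reward 1 each; states are plain ints here.
buffer = deque(maxlen=3)
buffer.append((0, 0, 1.0, 1, False))
buffer.append((1, 0, 1.0, 2, False))
buffer.append((2, 0, 1.0, 3, False))

reward, next_state, done = get_n_step_info(buffer, gamma=0.5)
obs, action = buffer[0][:2]
# The transition to store pairs the oldest obs/action with the
# n-step reward and the n-step next_state (3), not the one-step next_obs (1).
n_step_transition = (obs, action, reward, next_state, done)
```

Under these assumptions, storing the one-step `next_obs` alongside the n-step reward would bootstrap from the wrong state, which is the mismatch the issue describes.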