noise size equal to number of actions

Open SarodYatawatta opened this issue 3 years ago • 2 comments

https://github.com/philtabor/Youtube-Code-Repository/blob/733e4526f9920e5b710e29077fb85a457eec1ea9/ReinforcementLearning/PolicyGradient/TD3/td3_torch.py#L163

The noise added here is a scalar. It should instead be a vector with one entry per action: mu_prime = mu + T.tensor(np.random.normal(scale=self.noise, size=(self.n_actions,)),

SarodYatawatta, Mar 01 '21 11:03

We're allowed to add a scalar quantity to a vector. Is there a reason why each component of the mu tensor should have a different random number added to it?

philtabor, Aug 03 '21 16:08

True, but perturbing each component of the mu tensor with a different random number increases exploration, compared with adding the same random number to every component.

SarodYatawatta avatar Aug 03 '21 19:08 SarodYatawatta
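
To make the difference discussed above concrete, here is a minimal sketch of the two noise schemes. It is not the repository's code: it uses Python's standard random module as a stand-in for np.random.normal, and the function names (scalar_noise, vector_noise), the example mu values, and the scale of 0.1 are all assumptions chosen for illustration.

```python
import random


def scalar_noise(mu, scale, rng):
    """Add ONE Gaussian draw to every component (scalar broadcast)."""
    eps = rng.gauss(0.0, scale)  # single sample shared by all actions
    return [m + eps for m in mu]


def vector_noise(mu, scale, rng):
    """Add an independent Gaussian draw to each component."""
    return [m + rng.gauss(0.0, scale) for m in mu]


rng = random.Random(0)   # seeded only so the sketch is repeatable
mu = [0.2, -0.5, 0.9]    # hypothetical actor output for 3 actions

shared = scalar_noise(mu, 0.1, rng)
independent = vector_noise(mu, 0.1, rng)

# Offsets actually applied to each action component
shared_offsets = [p - m for p, m in zip(shared, mu)]
independent_offsets = [p - m for p, m in zip(independent, mu)]

print("scalar noise offsets:", shared_offsets)
print("vector noise offsets:", independent_offsets)
```

With the scalar version every component is shifted by the same amount, so the perturbed action only ever moves along the (1, 1, ..., 1) direction in action space. The per-component version can move the action in any direction, which is the extra exploration the last comment refers to.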