BipedalWalkerHardcore-SAC

Why not update alpha?

Open · scirocc opened this issue 4 years ago • 1 comment

#alpha_loss = -(self.log_alpha * (log_pi + self.target_entropy).detach()).mean()
#self.alpha_optim.zero_grad()
#alpha_loss.backward()
#self.alpha_optim.step()

Why are these lines commented out, so that alpha is never updated?

scirocc · Feb 06 '21 10:02

It's an experiment to test whether a fixed alpha works better. I found a result in the SAC paper suggesting that a fixed alpha might be the better choice, provided the alpha value is well chosen.

CoderAT13 · May 21 '21 01:05
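
For anyone reading this thread later, here is a rough sketch of what the commented-out lines would do compared with leaving alpha fixed. It is plain Python with a hand-written gradient and a plain gradient-descent step, not the repo's PyTorch code or its optimizer. The function name `alpha_step`, the `learn_alpha` flag, the learning rate, and the sample numbers are all made up for illustration. The `.detach()` in the original is mirrored by treating `log_pi + target_entropy` as a constant.

```python
import math


def alpha_step(log_alpha, log_pis, target_entropy, lr=3e-4, learn_alpha=True):
    """Hypothetical helper: one temperature update, returns (new_log_alpha, alpha_loss).

    alpha_loss = -(log_alpha * (log_pi + target_entropy)).mean()
    with (log_pi + target_entropy) treated as a constant.
    """
    # Batch mean of the constant term.
    mean_term = sum(lp + target_entropy for lp in log_pis) / len(log_pis)
    alpha_loss = -log_alpha * mean_term

    if not learn_alpha:
        # Fixed-alpha experiment: the loss can still be logged, but nothing changes.
        return log_alpha, alpha_loss

    # d(alpha_loss) / d(log_alpha) = -mean_term
    grad = -mean_term
    return log_alpha - lr * grad, alpha_loss


# Illustrative numbers only: three sampled log-probabilities and an assumed target.
log_pis = [-1.0, -2.0, -3.0]
target_entropy = -4.0

learned, _ = alpha_step(0.0, log_pis, target_entropy, lr=0.1)
fixed, _ = alpha_step(0.0, log_pis, target_entropy, lr=0.1, learn_alpha=False)

print("learned alpha:", math.exp(learned))
print("fixed alpha:  ", math.exp(fixed))
```

With the update enabled, alpha shrinks when the policy's entropy (the mean of `-log_pi`) is above the target and grows when it is below. With the lines commented out, alpha simply stays at whatever value it was initialized to.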