Deep-Reinforcement-Learning-Algorithms-with-PyTorch [Question] How was the target entropy in the discrete SAC chosen?


Open aivarsoo opened this issue 1 year ago • 0 comments


Hello! I have a question on the discrete SAC design.

What was the reasoning for choosing the target entropy in the discrete SAC? If I understand correctly, the target entropy represents the ideal entropy of the optimal policy. If so, why is it -0.98 * log(1 / |A|)?
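
For reference, here is a small sketch of what that expression evaluates to. It is not taken from the repository: the function names and the `scale` parameter are hypothetical, chosen only for illustration. It shows that -0.98 * log(1 / |A|) is the same as 0.98 * log(|A|), i.e. 98% of the entropy of a uniform policy over |A| actions, which is the largest entropy a discrete policy can have.

```python
import math


def discrete_target_entropy(num_actions: int, scale: float = 0.98) -> float:
    """Evaluate -scale * log(1 / |A|) for a discrete action space of size |A|.

    `scale` defaults to the 0.98 factor quoted in the question (hypothetical
    parameter name, not the repository's API).
    """
    return -scale * math.log(1.0 / num_actions)


def uniform_policy_entropy(num_actions: int) -> float:
    """Entropy of the uniform distribution over |A| actions, i.e. log(|A|)."""
    p = 1.0 / num_actions
    return -sum(p * math.log(p) for _ in range(num_actions))


if __name__ == "__main__":
    for n in (2, 4, 18):
        target = discrete_target_entropy(n)
        maximum = uniform_policy_entropy(n)
        # The ratio is the scale factor regardless of the number of actions.
        print(f"|A|={n}: target={target:.4f}, max={maximum:.4f}, ratio={target / maximum:.2f}")
```

So the target is positive and sits just below the maximum achievable entropy; the open part of the question is why a value that close to uniform was picked.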

Jan 13 '24 09:01 aivarsoo
