DecisionTransformerInterpretability

Investigate whether anyone else does this, or just experiment with fine-tuning PPO models without the entropy bonus at the end of training, to remove entropy-optimising behaviors.

Open • jbloomAus opened this issue 1 year ago • 1 comment

See discussion in #43
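
For concreteness, here is a rough sketch of what "fine-tuning without entropy at the end of training" could look like. It is not the repo's actual training code. The names `entropy_coef`, `ppo_loss`, `base_coef`, `finetune_frac` and `vf_coef` are all hypothetical, as are their default values. The idea is just to keep the usual entropy bonus for most of training and switch its coefficient to zero for the final fraction of updates.

```python
def entropy_coef(step: int, total_steps: int,
                 base_coef: float = 0.01,
                 finetune_frac: float = 0.1) -> float:
    """Entropy-bonus coefficient for a given update step.

    Hypothetical schedule: use base_coef for most of training, then
    0.0 for the last finetune_frac of updates (the fine-tuning phase).
    """
    finetune_steps = int(total_steps * finetune_frac)
    cutoff = total_steps - finetune_steps
    return base_coef if step < cutoff else 0.0


def ppo_loss(policy_loss: float, value_loss: float, entropy: float,
             step: int, total_steps: int, vf_coef: float = 0.5) -> float:
    """Combine the usual PPO loss terms with the scheduled entropy bonus."""
    ent_coef = entropy_coef(step, total_steps)
    # The entropy bonus is subtracted because the loss is minimised.
    return policy_loss + vf_coef * value_loss - ent_coef * entropy
```

A hard switch is the simplest option. Annealing the coefficient to zero would be another thing to try if the abrupt change destabilises training.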

jbloomAus, Apr 17 '23 22:04