pytorch-soft-actor-critic issues

Model saving and loading

1

Hey, how do I use save_checkpoint and load_checkpoint in sac.py?

Doubts about Regularization in policy loss

Thank you for your contribution. However, I'm confused about `reg_loss = 0.001 * (mean.pow(2).mean() + log_std.pow(2).mean())  # Regularization Loss` in the code. Could you explain it? Any help would be appreciated.

Marxvans
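
For readers with the same question: one plausible reading of the quoted line is that it is a small L2 penalty on the policy network's outputs (the Gaussian mean and log standard deviation), scaled by 0.001, so that those outputs are discouraged from drifting to large magnitudes. This is an interpretation, not something the repository confirms. The sketch below reproduces the arithmetic in plain Python instead of PyTorch; the helper names and sample values are made up for illustration.

```python
def mean_of_squares(values):
    """Plain-Python stand-in for tensor.pow(2).mean() on a flat list."""
    return sum(v * v for v in values) / len(values)


def reg_loss(mean, log_std, coeff=0.001):
    """Hypothetical helper mirroring the quoted expression:
    coeff * (mean.pow(2).mean() + log_std.pow(2).mean())."""
    return coeff * (mean_of_squares(mean) + mean_of_squares(log_std))


# Outputs close to zero incur a small penalty...
small = reg_loss(mean=[0.1, -0.2], log_std=[-1.0, -0.5])

# ...while large-magnitude outputs incur a larger one.
large = reg_loss(mean=[3.0, -4.0], log_std=[-5.0, 2.0])

print(small, large)
```

Because the coefficient is small, the term would only matter when the outputs grow large; added to the policy loss, its gradient pulls `mean` and `log_std` back toward zero.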

Resume training

5

Hello, I am trying to use the SAC agent and resume training. To do that, I do:
```
def load_model(actor_path, critic_path, optimizer_actor_path, optimizer_critic_path, optimizer_alpha_path):
    policy = torch.load(actor_path)
    self.alpha = policy['alpha'].detach().item()...
```

Tomeu7

help wanted

Unable to reproduce results on Humanoid-v2 in new SAC

6

I am unable to reproduce the result reported in the paper 'Soft Actor-Critic Algorithms and Applications' on the OpenAI environment Humanoid-v2. The result is 6000 while the original...

zwfightzw

help wanted

continuous cartpole trains

2

jonberliner

Exploding entropy temperature

10

Hi, when I set automatic_entropy_tuning to true in an environment with an action space of shape 1, my entropy temperature explodes, increasing exponentially to a magnitude of 10^8...

reubenwong97

Running SAC: Operation failed to compute its gradient

3

**Environment:**
- torch==1.5.0
- mujoco-py==2.0.2.10

**Usage:** `python main.py --env-name Humanoid-v2 --alpha 0.05`

**Error:** RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor...

ian-cannon

Action scale and action bias

1

Hi guys, you did a great job here! I'm trying to modify the algorithms to my needs, and I can't quite understand two variables in the neural network classes. What are action_scale...

shakenov-chinga
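
A common convention in tanh-squashed policies, and an assumption about what these two names mean here, is that `action_scale` is half the width of the environment's action range and `action_bias` is its midpoint, so that a tanh output in [-1, 1] can be mapped onto [low, high]. The sketch below shows that mapping in plain Python; `make_rescaler` and the bounds are hypothetical and chosen only for illustration.

```python
import math


def make_rescaler(low, high):
    """Build a mapping from an unbounded network output to [low, high]."""
    action_scale = (high - low) / 2.0  # half the width of the action range
    action_bias = (high + low) / 2.0   # midpoint of the action range

    def rescale(raw_output):
        # tanh squashes to [-1, 1]; scale and bias stretch and shift it.
        return math.tanh(raw_output) * action_scale + action_bias

    return action_scale, action_bias, rescale


# Hypothetical action bounds of [-2, 6].
scale, bias, rescale = make_rescaler(-2.0, 6.0)

for raw in (-50.0, 0.0, 50.0):
    print(raw, rescale(raw))
```

Under this convention, a symmetric range such as [-1, 1] gives a scale of 1 and a bias of 0, which is why the two variables have no visible effect in many environments.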