robust_RL_multi_adversary
We investigate the effect of adversary populations on finding good solutions to the robust MDP problem.
Hello! I want to get a better understanding of this code, so I need the distribution over actions, such as the mean and standard deviation of the Gaussian distribution....
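In RLlib you can usually pull the distribution parameters out of the policy's extra action fetches. Below is a minimal sketch, assuming an already-built trainer (e.g. a PPOTrainer) with a diagonal-Gaussian action distribution; the fetch key differs across Ray versions, so both common names are tried:

```python
import numpy as np

# `trainer` and `obs` are assumed to exist already (a built RLlib trainer
# and a single environment observation).
action, _, extras = trainer.compute_action(obs, full_fetch=True)

# Older Ray releases expose the Gaussian parameters as "behaviour_logits";
# newer ones use "action_dist_inputs". Both are [mean, log_std] concatenated.
dist_inputs = np.asarray(
    extras.get("action_dist_inputs", extras.get("behaviour_logits")))
mean, log_std = np.split(dist_inputs, 2)
print("mean:", mean, "std:", np.exp(log_std))
```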
Added Bernoulli Bandit and appropriately updated run scripts.
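For reference, a Bernoulli bandit is a tiny environment where each arm pays out 1 with a fixed probability. Here is an illustrative gym-style sketch; the arm probabilities and horizon are placeholders, not the repo's actual configuration:

```python
import gym
import numpy as np
from gym import spaces

class BernoulliBandit(gym.Env):
    """K-armed bandit where arm i pays reward 1 with probability probs[i]."""

    def __init__(self, probs=(0.2, 0.5, 0.8), horizon=100):
        self.probs = np.asarray(probs)
        self.horizon = horizon
        self.action_space = spaces.Discrete(len(probs))
        # Observations are a dummy constant; bandits are stateless.
        self.observation_space = spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32)
        self.t = 0

    def reset(self):
        self.t = 0
        return np.zeros(1, dtype=np.float32)

    def step(self, action):
        self.t += 1
        reward = float(np.random.rand() < self.probs[action])  # Bernoulli payout
        done = self.t >= self.horizon
        return np.zeros(1, dtype=np.float32), reward, done, {}
```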
The ma_crowd script fails for me with the following error:
```
File "/Users/kanaad/miniconda3/envs/sim2real/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 415, in train
    raise e
File "/Users/kanaad/miniconda3/envs/sim2real/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 401, in train
    result = Trainable.train(self)
File "/Users/kanaad/miniconda3/envs/sim2real/lib/python3.6/site-packages/ray/tune/trainable.py",...
```
This happens because of the import of `env_creator`.
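If the crash really is triggered at import time, one common workaround (a sketch, not the repo's actual fix) is to defer the import so it only runs inside the function that needs it:

```python
def make_env(env_config):
    # Lazy import: `env_creator` is only resolved when the env is actually
    # constructed, so merely importing this module can no longer fail.
    from envs import env_creator  # module path is an assumption
    return env_creator(env_config)
```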
The RAM usage of training grows steadily over time, eventually causing the experiment to fail. Why?
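One way to confirm and localize a leak, assuming a standard trainer.train() loop, is to log the driver process's resident memory every iteration with psutil (a diagnostic sketch, not the repo's code):

```python
import psutil

proc = psutil.Process()  # the current (driver) process
for i in range(200):  # iteration count is a placeholder
    result = trainer.train()  # `trainer` is assumed to exist
    rss_gb = proc.memory_info().rss / 1e9
    print("iter {}: rss = {:.2f} GB".format(i, rss_gb))
```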
The adversary applies its actions to the latents of the autoencoder.
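A hypothetical illustration of that idea: encode the observation, let the adversary perturb the latent vector within a bound, and decode the perturbed latent before the agent sees it. All names here (encoder, decoder, adversary_policy, eps) are assumptions, not the repo's API:

```python
import numpy as np

def adversarial_latent_step(obs, encoder, adversary_policy, decoder, eps=0.1):
    z = encoder(obs)                         # latent code of the observation
    delta = adversary_policy(z)              # adversary action in latent space
    z_adv = z + eps * np.clip(delta, -1, 1)  # bounded perturbation of latents
    return decoder(z_adv)                    # perturbed observation for the agent
```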
Currently the rollout test of a trained policy generates a video but does not clean it up afterwards.
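A minimal cleanup sketch, assuming the rollout writes the video to a known path; deleting it in a finally block keeps test artifacts from accumulating. Both run_rollout_and_record and video_path are hypothetical names:

```python
import os

video_path = "/tmp/rollout_test.mp4"  # placeholder path
try:
    run_rollout_and_record(video_path)  # hypothetical rollout helper
finally:
    # Remove the recorded video whether or not the rollout succeeded.
    if os.path.exists(video_path):
        os.remove(video_path)
```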