Steven H. Wang

8 issues by Steven H. Wang

h/t @qxcv `AdversarialTrainer.train()` will repeatedly call `PPO.learn(total_timesteps=gen_batch_size, reset_num_timesteps=False)`, where `gen_batch_size` is usually small compared to conventional RL training. Regardless of whether `reset_num_timesteps=False` is set, `PPO` doesn't know the actual number...
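A minimal, self-contained sketch of the failure mode described above. `ToyLearner` is invented for illustration (it is not the real PPO); it only mimics the stable-baselines-style convention of computing a linear progress schedule from the `total_timesteps` passed to each `learn()` call:

```python
class ToyLearner:
    """Toy stand-in for PPO that tracks a linear progress schedule."""

    def __init__(self):
        self.num_timesteps = 0

    def learn(self, total_timesteps, reset_num_timesteps=True):
        if reset_num_timesteps:
            self.num_timesteps = 0
        else:
            # Mimic extending the target by steps already taken, as
            # stable-baselines-style code does when not resetting.
            total_timesteps += self.num_timesteps
        progress_remaining = []
        while self.num_timesteps < total_timesteps:
            self.num_timesteps += 1
            progress_remaining.append(1 - self.num_timesteps / total_timesteps)
        return progress_remaining

learner = ToyLearner()
first = learner.learn(4, reset_num_timesteps=False)   # ends at progress 0
second = learner.learn(4, reset_num_timesteps=False)  # jumps back up, ends at 0 again
```

Each short call sweeps its schedule all the way to 0, so anything annealed against progress (e.g. the learning rate) sawtooths per call rather than spanning the whole adversarial training run.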

https://github.com/HumanCompatibleAI/adversarial-policies/blob/3a273ea9b7a02c34f95917bb56c1473e9a1af3eb/src/modelfree/common/utils.py#L44

https://github.com/IDSIA/sacred/issues/498 was resolved.

The `pp.inference` API is a mess. All but two of the dozen or so available functions were used only for experimentation/troubleshooting and are irrelevant to the paper, so it might be...

Episode reward summaries are all concentrated on a few steps, with jumps in between.

Zoomed out: ![image](https://user-images.githubusercontent.com/1750835/50369978-20aace00-0553-11e9-91a8-334ca4f405c4.png)

Zoomed in: ![image](https://user-images.githubusercontent.com/1750835/50370046-6ddb6f80-0554-11e9-914f-8b2b8c45270f.png)

Every other summary looks fine: ![image](https://user-images.githubusercontent.com/1750835/50370120-b9dae400-0555-11e9-98b6-92badee5c622.png)

To reproduce, run...

bug

PPO2 uses a `with TensorboardWriter(...) as writer:` context that `flush`es its `tf.summary.FileWriter` but never closes it. In combination with another problem on my side, this led to a "too many...
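A sketch of the kind of fix implied above: have the context manager `close()` the underlying writer on exit rather than only flushing it. `DummyWriter` is a stand-in for `tf.summary.FileWriter` so the sketch runs without TensorFlow, and this `TensorboardWriter` is a simplified stand-in, not the real stable-baselines class:

```python
class DummyWriter:
    """Stand-in for tf.summary.FileWriter (holds an OS file handle)."""

    def __init__(self):
        self.closed = False

    def flush(self):
        pass

    def close(self):
        self.flush()
        self.closed = True


class TensorboardWriter:
    """Simplified context manager that releases the writer on exit."""

    def __init__(self, writer):
        self.writer = writer

    def __enter__(self):
        return self.writer

    def __exit__(self, exc_type, exc, tb):
        # Closing (not just flushing) releases the file handle, avoiding
        # "too many open files" when many writers are created over a run.
        self.writer.close()
        return False


w = DummyWriter()
with TensorboardWriter(w):
    pass  # handle is released as soon as the block exits
```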

bug
help wanted

`--gpu-ids=''` results in a parsing error, since the code tries to cast the empty string to an int.
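A hypothetical parser illustrating both the bug and a fix (`parse_gpu_ids` is invented for illustration, not the repo's actual function): `''.split(',')` yields `['']`, and `int('')` raises `ValueError`, so empty tokens need to be filtered out before casting.

```python
def parse_gpu_ids(arg: str) -> list[int]:
    """Parse a comma-separated GPU id string; '' means no GPUs."""
    # Filter out empty tokens so ``''`` and trailing commas don't
    # reach int() and raise ValueError.
    return [int(tok) for tok in arg.split(",") if tok.strip()]

parse_gpu_ids("")       # -> []
parse_gpu_ids("0,1,2")  # -> [0, 1, 2]
```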

`models/models.py` is looking for {FFG,MVG}Model, not {FFG,MVG}.