Maxime RICHE

Results 7 issues of Maxime RICHE

Can I ask why you have deleted this content? A legal problem? (I implemented it too...)

Are the current default hyper-parameters the ones used to produce the results of the [DiCE paper](https://arxiv.org/pdf/1802.05098.pdf)? The current default HPs are (from [scripts/run_lola_dice.py](https://github.com/alshedivat/lola/blob/master/scripts/run_lola_dice.py)): ``` batch-size=64 runs=5 epochs=200 use_dice=True gamma=.96, lr_inner=.1, lr_outer=.2,...

In the notebook [notebooks/dice/analysis.ipynb](https://github.com/alshedivat/lola/blob/master/notebooks/dice/analysis.ipynb), which is used to analyse the results and reproduce Fig. 5 of the paper [DiCE: The Infinitely Differentiable Monte Carlo Estimator](https://arxiv.org/pdf/1802.05098.pdf), the confidence interval used is...

In https://github.com/alshedivat/lola/blob/master/lola/envs/coin_game.py, the symmetry is broken in favor of player red. When the two players move at the same time onto the cell with the coin, player red has the...
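One way such a bias could be removed is a random tie-break when both players reach the coin on the same step. The following is only a minimal sketch under assumptions: the function name `resolve_pickup`, the tuple positions, and the uniform random tie-break are hypothetical and are not taken from `coin_game.py`.

```python
import random


def resolve_pickup(red_pos, blue_pos, coin_pos, rng=random):
    """Return which player picks up the coin: "red", "blue", or None.

    Hypothetical helper: positions are (row, col) tuples and `rng` is any
    object exposing `choice` (the `random` module or a `random.Random`).
    """
    red_on_coin = red_pos == coin_pos
    blue_on_coin = blue_pos == coin_pos
    if red_on_coin and blue_on_coin:
        # Both players arrived at the same time: choose uniformly at random
        # so that neither colour is systematically favoured.
        return rng.choice(["red", "blue"])
    if red_on_coin:
        return "red"
    if blue_on_coin:
        return "blue"
    return None


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    picks = [resolve_pickup((1, 1), (1, 1), (1, 1), rng) for _ in range(1000)]
    print("red share on ties:", picks.count("red") / len(picks))
```

Passing a seeded `random.Random` keeps experiments reproducible while still treating both players symmetrically in expectation.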

Remove the use of lock_replay during training (it must not be used in LTFT). Create the submodule marltoolbox.utils.log. Move the methods that summarize a model into a helper class. Use before_init_loss instead of after_init (policy...

To add in https://github.com/longtermrisk/marltoolbox/tree/master/tests/marltoolbox/examples: the quick test in test_end_to_end.py and the long test in manual_test_end_to_end.py.