Marco Pleines comments

Results: 63 comments of Marco Pleines

bootstrapped ci (shows no variance) vs std (shows high variance)

Thanks for your reply, @agarwl. My current take is to have the data in the shape of `(5 runs, 150 episodes, 101 checkpoints)`. Compared to your terms: `checkpoints = frames`,...
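As a rough illustration of the comparison named in the thread title, the sketch below builds synthetic scores in the `(runs, episodes, checkpoints)` layout, averages over episodes, and then computes both the standard deviation across runs and a percentile-bootstrap confidence interval for the mean across runs at a single checkpoint. The synthetic data, the function names, and the choice to resample over runs are assumptions made for illustration; this is not necessarily how the numbers in the thread were produced.

```python
import random
import statistics


def make_scores(n_runs=5, n_episodes=150, n_checkpoints=101, seed=0):
    """Synthetic scores indexed as scores[run][episode][checkpoint] (hypothetical data)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_runs):
        run_offset = rng.gauss(0.0, 0.1)  # assumed per-run difference
        scores.append(
            [
                [run_offset + rng.gauss(0.5, 0.2) for _ in range(n_checkpoints)]
                for _ in range(n_episodes)
            ]
        )
    return scores


def per_run_means(scores, checkpoint):
    """Average over episodes, leaving one value per run at the given checkpoint."""
    return [
        statistics.fmean([episode[checkpoint] for episode in run]) for run in scores
    ]


def bootstrap_ci(run_means, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the mean, resampling runs with replacement."""
    rng = random.Random(seed)
    n = len(run_means)
    resampled = sorted(
        statistics.fmean(rng.choices(run_means, k=n)) for _ in range(n_resamples)
    )
    low = resampled[int((alpha / 2) * n_resamples)]
    high = resampled[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


scores = make_scores()
run_means = per_run_means(scores, checkpoint=100)  # last of the 101 checkpoints
std = statistics.stdev(run_means)  # spread of individual runs
ci_low, ci_high = bootstrap_ci(run_means)  # uncertainty of the mean over runs
print(f"std across runs: {std:.4f}")
print(f"95% bootstrap CI of the mean: [{ci_low:.4f}, {ci_high:.4f}]")
```

The two quantities answer different questions: the standard deviation describes how far individual runs spread, while the interval describes uncertainty in the mean, so they are not expected to have the same width. Which axes are averaged and which are resampled also changes the width of the interval considerably.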

Add PPO + Transformer-XL

### pre-commit

pre-commit fails because of two "obsolete" imports: `memory_gym` and `PoMEnv`. Without those imports, the environments are not registered inside gymnasium.

### enjoy.py

I added a script to load...

Add PPO + Transformer-XL

Hi @roger-creus, the benchmarks just completed. The next step is to prepare the reports and then write the docs.

Add PPO + Transformer-XL

It reproduces the results of my paper: [https://arxiv.org/abs/2309.17207](https://arxiv.org/abs/2309.17207) and this is the original implementation: [https://github.com/MarcoMeter/neroRL](https://github.com/MarcoMeter/neroRL)

Add PPO + Transformer-XL

IMHO, here are the remaining TODOs of this PR:
- [x] Upload trained models to HuggingFace
- [x] Download and run these models using /cleanrl/ppo_trxl/enjoy.py
- [x] Rename `blocks` to...

Add PPO + Transformer-XL

> Do you know why the wandb chart looks like this? > What are you referring to? This is how I created the report: ``` @echo off python -m openrlbenchmark.rlops...

Add PPO + Transformer-XL

It seems that other reports have this as well, like: https://wandb.ai/openrlbenchmark/cleanrl/reports/CleanRL-PPG-vs-PPO-results--VmlldzoyMDY2NzQ5

Add PPO + Transformer-XL

I did some refinements:
- Added hyperparameters to the docs for training MiniGrid-Memory-S9-v0 and ProofOfMemory-v0
- Added pre-trained models to huggingface for these envs
- ProofOfMemory-v0 can be adequately rendered...

Add PPO + Transformer-XL

> My last step before merging is to make sure that poetry and the dependencies blend well.

Done.

Any usage of poetry after installation: No module named 'tomli'

Hi @vwxyzjn, I have to reopen this issue because I'm facing it again while attempting to execute the benchmarks for [pr459](https://github.com/vwxyzjn/cleanrl/pull/459) via `./benchmark/ppo_trxl.sh` on my institution's slurm cluster. Do you...