David Slayback
Sorry for not giving any updates for a while. I'm running into a couple of issues following the guidelines: 1) The official reference implementations (4 [based](https://github.com/mklissa/PPOC) [on](https://github.com/anandkamat05/TDEOC) [openai-baselines](https://github.com/kkhetarpal/ioc) [ppo](https://github.com/mklissa/MOC)) differ from...
No, only the combination of all 3, plus whatever's going on in the gist I linked. Sorry I couldn't refine it down to a smaller reproduction script :(. It occurs...
To be clear, that code block was me attempting to create a more minimal working example and failing! I put the print statements in prematurely. I can only provoke the...
@vwxyzjn Yeah, sorry, I linked those implementations to show the divergence of even the basics of option critic implementations. While each proposes a new technique, they also make a lot...
Just adding my own experience as a notebook! I'm trying to draw basic walls and find using boxes more intuitive, but the physics get interesting. The alternative version (extend_ant_cfg instead...
I think the problem was solved with version 0.0.12 (with the new physics engine). I included a reproduction below: https://colab.research.google.com/gist/DavidSlayback/0611a3cfff4871a33bae9bd5c04b08d7/copy-of-braxscratch.ipynb The divide-by-0 error is gone, and the ant actually makes...
Yeah, 2048 environments have a lot of randomness built in, but if I'm trying to solve a reasonably-sized generalized task (like a procedurally generated maze or foraging task) instead of...
Got it, thanks again! For the first question, there's no weirdness with using one key to draw a vector of 2 random numbers from the same range (each is -4.5...
Thank you again! I realize you've got plenty of other stuff to work on even just with Brax, and I appreciate the thorough answers. I'll probably go with your second option. As...
Update on the reset numbers: https://colab.research.google.com/gist/DavidSlayback/bf5038ec024bb6e47568af2e2ba99c16/autoreset.ipynb#scrollTo=gazgx0KXWJfw So I implemented a couple basic strategies using your built-in "fetch" environment just to keep the notebook simple: 1) Original AutoReset (same "first state"...