Joseph Bloom
I failed to get the TransformerPPO model working. I suspect this is because of bugs, but it could also be that transformers are inherently unstable. If someone is interested in...
My initial plan was to look at how decision transformers solved various tasks and compare this to trajectory transformers trained with online RL (via PPO). However, this may be difficult...
I recently wrote a version of get_minibatches in the memory class of the ppo subpackage: https://github.com/jbloomAus/DecisionTransformerInterpretability/blob/c84edb381c53b3f9ef2fa9517e34914a52e15fbd/src/ppo/memory.py#L210-L393 TL;DR: This is important for sampling sections of trajectories, which is necessary for online...
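To make "sampling sections of trajectories" concrete, here is a minimal sketch. It is not the repo's `get_minibatches`: the function name `sample_trajectory_windows`, the list-of-lists trajectory format, and the `window_len` parameter are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def sample_trajectory_windows(
    trajectories: Sequence[Sequence[int]],
    window_len: int,
    n_samples: int,
    rng: random.Random,
) -> List[Tuple[int, int, List[int]]]:
    """Sample contiguous windows of `window_len` steps from stored trajectories.

    Hypothetical helper, not the repo's API. Returns a list of
    (trajectory index, start step, window) tuples. Trajectories shorter
    than `window_len` are skipped rather than padded.
    """
    # Only trajectories with at least `window_len` steps can supply a window.
    eligible = [i for i, t in enumerate(trajectories) if len(t) >= window_len]
    if not eligible:
        raise ValueError("no trajectory is at least window_len steps long")

    windows = []
    for _ in range(n_samples):
        i = rng.choice(eligible)
        # random.Random.randint includes both endpoints.
        start = rng.randint(0, len(trajectories[i]) - window_len)
        windows.append((i, start, list(trajectories[i][start:start + window_len])))
    return windows


# Toy usage: integers stand in for per-step data (observations, actions, ...).
demo_trajectories = [[0, 1, 2, 3, 4], [10, 11], [20, 21, 22]]
demo_windows = sample_trajectory_windows(
    demo_trajectories, window_len=3, n_samples=4, rng=random.Random(0)
)
```

Skipping short trajectories is the simplest choice for a sketch; a real implementation might pad them and carry a mask instead.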
I currently host [a public version of the interpretability app](https://jbloomaus-decisiontransformerinterpretability-app-4edcnc.streamlit.app/) on Streamlit Community Cloud, which is very slow. I would be willing to pay for a service to host the...
It's now possible to allow columns (up to one level of nesting) inside columns in Streamlit apps thanks to https://github.com/streamlit/streamlit/pull/5941. I think this creates an opportunity to make the streamlit...
Selective Noise Injection (SNI) and Information Bottleneck Actor-Critic (IBAC) make models better at generalising (including in at least one MiniGrid environment). It seems like a fun hack-dayish kind of effort...
This is a bug in the trajectorywriter/offline dataset: when online training finishes, some trajectories get cut off, which leaves us with “short” truncated trajectories, which are...
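One way to spot affected trajectories is to check whether the final recorded step actually ended an episode. The sketch below is only an illustration under stated assumptions: the dictionary format with per-step `terminated` and `truncated` flag lists of equal length, and the name `split_by_completion`, are hypothetical and not taken from the repo's trajectory writer.

```python
from typing import Dict, List, Tuple

# Hypothetical storage format: one list of per-step flags per key.
Trajectory = Dict[str, List[bool]]


def split_by_completion(
    trajectories: List[Trajectory],
) -> Tuple[List[Trajectory], List[Trajectory]]:
    """Separate trajectories that ended in the environment from cut-off ones.

    A trajectory counts as complete when its last step is flagged as
    terminated or truncated by the environment. Anything else (including
    an empty trajectory) is assumed to have been cut off when data
    collection stopped.
    """
    complete: List[Trajectory] = []
    cut_off: List[Trajectory] = []
    for traj in trajectories:
        terminated = traj["terminated"]
        truncated = traj["truncated"]
        # Empty flag lists mean no steps were recorded, so nothing finished.
        finished = bool(terminated) and (terminated[-1] or truncated[-1])
        (complete if finished else cut_off).append(traj)
    return complete, cut_off
```

Whether cut-off trajectories should be dropped or kept with a marker is a separate design decision that this sketch does not settle.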
I've explained how to use the repo in the docs, but this could be way better. Some examples of pages that might be useful to add to the Sphinx docs....
A couple of weeks ago I worked with Navpreet to start writing a [Maze Environment](https://github.com/Farama-Foundation/Minigrid/pull/317) for Minigrid which is mostly working. Finishing this PR and possibly adding a version that...
I wrote a few [wrappers for Minigrid environments](https://github.com/jbloomAus/DecisionTransformerInterpretability/blob/c84edb381c53b3f9ef2fa9517e34914a52e15fbd/src/environments/wrappers.py), which should probably live in the [Minigrid](https://github.com/Farama-Foundation/Minigrid) GitHub repo. If someone has the time to submit that as a PR and write...