Joseph Bloom

Results 41 issues of Joseph Bloom

I failed to get the TransformerPPO model working. I suspect this is because of bugs, but it could also be that transformers are inherently unstable. If someone is interested in...

My initial plan was to look at how decision transformers solved various tasks and compare this to trajectory transformers trained with online RL (via PPO); however, this may be difficult...

enhancement
important

I recently wrote a version of get_minibatches in the memory class of the ppo subpackage. https://github.com/jbloomAus/DecisionTransformerInterpretability/blob/c84edb381c53b3f9ef2fa9517e34914a52e15fbd/src/ppo/memory.py#L210-L393 TL;DR: This is important for sampling sections of trajectories, which is necessary for online...

enhancement
good first issue

I currently host [a public version of the interpretability app](https://jbloomaus-decisiontransformerinterpretability-app-4edcnc.streamlit.app/) with the streamlit community, which is very slow. I would be willing to pay for a service to host the...

enhancement
streamlit-app

It's now possible to allow columns (up to one level of nesting) inside columns in streamlit apps thanks to https://github.com/streamlit/streamlit/pull/5941. I think this creates an opportunity to make the streamlit...

streamlit-app

Selective Noise Injection (SNI) and Information Bottleneck Actor-Critic (IBAC) make models better at generalising (including in at least one MiniGrid environment). It seems like a fun hack-dayish kind of effort...

crazy project ideas

This is a bug in the trajectorywriter/offline dataset where we end up truncating some trajectories when we finish online training, and this leads to “short” truncated trajectories, which are...

bug

I've explained how to use the repo in the docs, but this could be way better. Some examples of pages that might be useful to add to the sphinx docs....

documentation

A couple of weeks ago I worked with Navpreet to start writing a [Maze Environment](https://github.com/Farama-Foundation/Minigrid/pull/317) for Minigrid which is mostly working. Finishing this PR and possibly adding a version that...

minigrid

I wrote a few [wrappers for minigrid environments](https://github.com/jbloomAus/DecisionTransformerInterpretability/blob/c84edb381c53b3f9ef2fa9517e34914a52e15fbd/src/environments/wrappers.py), which should probably live in the [Minigrid](https://github.com/Farama-Foundation/Minigrid) GitHub repo. If someone has the time to submit that as a PR and write...

minigrid