Joseph Bloom issues

Results: 41 issues of Joseph Bloom

Create an Object-Vector Calculator in the Streamlit App

Similar to the Maze experiments done by the shard theory team, calculate a vector corresponding to the key/ball in the memory env (or make a more general tool) that can...

Investigate whether anyone else does this, or just experiment with fine-tuning PPO models without entropy at the end of training to remove entropy-optimising behaviours.

See discussion in #43

Update the app to also work with BC models

The app likely doesn't work with behavioural cloning (BC) models, and this should be fixed so that we can do a comparison.

Add isort to the pre-commit config yaml without causing an issue with black

I attempted this, but isort still clashed with black even with `profile = black` set.

Add checkpoints during Offline Training

Our PPO models are now stored with checkpoints but our offline trained models aren't. Creating some parity here would be good. Please ensure: - [ ] It remains easy to...

help wanted

Major: Convert the codebase into an installable python package.

This is an important task I wouldn't want to do if there was lots of other work in flight or before I had some more results. However, I wanted to...

Update PPO checkpoint code to upload each checkpoint in real time rather than all at the end of the workflow

Currently `store_model_checkpoint` adds checkpoint files to a wandb artifact that is only uploaded at the end of the PPO training cycle. It would be better if these...

good first issue
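
One way to picture the real-time upload issue above is the minimal sketch below. It is not the repository's `store_model_checkpoint`: the function name `save_and_upload_checkpoint`, the file naming scheme, and the `upload` callback are all hypothetical. With wandb, the callback could wrap creating a `wandb.Artifact`, calling `add_file` on it, and logging it with `log_artifact`, once per checkpoint.

```python
import tempfile
from pathlib import Path
from typing import Callable


def save_and_upload_checkpoint(
    state: bytes,
    step: int,
    checkpoint_dir: Path,
    upload: Callable[[Path], None],
) -> Path:
    """Write one checkpoint to disk and hand it to `upload` straight away.

    Hypothetical helper: `upload` stands in for whatever pushes a single
    file to remote storage (for example, a wrapper that logs a wandb artifact).
    """
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    path = checkpoint_dir / f"checkpoint_{step:06d}.pt"
    path.write_bytes(state)
    upload(path)  # upload now, not at the end of training
    return path


# Usage: record uploads in a list instead of contacting a real service.
uploaded = []
with tempfile.TemporaryDirectory() as tmp:
    for step in (100, 200, 300):
        save_and_upload_checkpoint(
            b"weights", step, Path(tmp), lambda p: uploaded.append(p.name)
        )
```

Passing the uploader in as a callback keeps the training loop testable without network access, and a crash mid-run loses only the checkpoints not yet written.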

Work out a nice way to manage CLI args across all the new PPO models

The end-to-end tests show how to pass arguments to the different PPO models (FC, Transformer, LSTM), but it might be nice to have a command line tool that handles all of...
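
A minimal sketch of one possible layout, assuming `argparse` subcommands with a shared parent parser. The program name `train-ppo` and every flag and default below are hypothetical and are not taken from the codebase.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flags shared by every model type live on a parent parser.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--total-timesteps", type=int, default=100_000)
    common.add_argument("--learning-rate", type=float, default=2.5e-4)
    common.add_argument("--seed", type=int, default=1)

    parser = argparse.ArgumentParser(prog="train-ppo")
    subparsers = parser.add_subparsers(dest="model", required=True)

    # One subcommand per model family; each adds only its own flags.
    subparsers.add_parser("fc", parents=[common])

    transformer = subparsers.add_parser("transformer", parents=[common])
    transformer.add_argument("--n-layers", type=int, default=2)
    transformer.add_argument("--n-heads", type=int, default=4)

    lstm = subparsers.add_parser("lstm", parents=[common])
    lstm.add_argument("--hidden-size", type=int, default=128)
    return parser


# Usage: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["transformer", "--n-layers", "3", "--seed", "7"])
lstm_args = build_parser().parse_args(["lstm"])
```

Subcommands keep model-specific flags from leaking into the other models' namespaces, while the parent parser avoids repeating the shared training flags.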

Write a Probe Environment that tests a model's ability to look at previous observations

We currently have 5 probe environments for single timestep models and I'd like a probe environment to test if a model can learn: 1. to take the correct action as...
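
As an illustration only, here is a sketch of what a memory probe could look like. The design is an assumption rather than the spec from this issue: a cue is shown at the first timestep, the next observation is blank, and the final action is rewarded only if it matches the cue. The class name `RecallProbeEnv` and the three-value `step` return are hypothetical and do not follow the gym API exactly.

```python
import random


class RecallProbeEnv:
    """Hypothetical probe: cue at reset, blank observation, rewarded recall."""

    BLANK = 0  # observation value carrying no information

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._cue = 0
        self._t = 0

    def reset(self):
        self._cue = self._rng.choice([0, 1])
        self._t = 0
        # Observations 1 and 2 encode cue 0 and cue 1; 0 is reserved for blank.
        return self._cue + 1

    def step(self, action):
        """Return (observation, reward, done)."""
        self._t += 1
        if self._t == 1:
            # First action is ignored; the agent now sees only a blank.
            return self.BLANK, 0.0, False
        reward = 1.0 if action == self._cue else 0.0
        return self.BLANK, reward, True


# Usage: an agent that remembers the cue earns the reward.
env = RecallProbeEnv(seed=0)
cue = env.reset() - 1
first = env.step(0)
final = env.step(cue)
```

Because the rewarded action is chosen while the observation is blank, a policy that conditions only on the current observation cannot beat chance here.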

Make it possible to change the grid size in the TrajectoryLSTM model to something other than 7

**more details to add later or message me** Might be valuable for training performance.

good first issue