otoworld
Refactor evaluation code
- Clean up the `evaluation.ipynb` notebook so it is essentially plug and play (e.g. avoid the ground-truth (GT) and estimated (EST) sources getting mixed up or swapped)
  - make it easy for people to evaluate their results
- Refactor `plot_runs.py`
  - so it stores data automatically while experiments are running
  - so it makes no assumptions (e.g. the line that assumes validation happens every 5 episodes)
- Anything else that makes it easier to evaluate the trained model
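
One possible way to drop the "validation every 5 episodes" assumption is to tag each logged episode with its phase at write time, so the plotting code only reads the tag and never infers the schedule. This is a rough sketch, not the current `plot_runs.py` code: the function names (`make_writer`, `log_episode`, `split_by_phase`), the CSV layout, and the `mean_reward` column are all hypothetical.

```python
import csv
import io

# Hypothetical log schema: one row per episode, with the phase stored explicitly.
FIELDS = ["episode", "phase", "mean_reward"]
PHASES = ("train", "validation")


def make_writer(stream):
    """Create a CSV writer on an open text stream and write the header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    return writer


def log_episode(writer, episode, phase, mean_reward):
    """Append one episode to the log; intended to be called while the experiment runs."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    writer.writerow({"episode": episode, "phase": phase, "mean_reward": mean_reward})


def split_by_phase(stream):
    """Read the log back and group (episode, mean_reward) pairs by the stored phase."""
    curves = {phase: [] for phase in PHASES}
    for row in csv.DictReader(stream):
        curves[row["phase"]].append((int(row["episode"]), float(row["mean_reward"])))
    return curves


if __name__ == "__main__":
    # In-memory stream for the demo; a real run would use an open file instead.
    buffer = io.StringIO()
    writer = make_writer(buffer)
    # Deliberately irregular validation schedule (episodes 2 and 7).
    for episode, phase, reward in [
        (0, "train", 0.1),
        (1, "train", 0.2),
        (2, "validation", 0.15),
        (3, "train", 0.3),
        (7, "validation", 0.4),
    ]:
        log_episode(writer, episode, phase, reward)
    buffer.seek(0)
    print(split_by_phase(buffer))
```

Because the phase is stored with every row, the plotting side would keep working if the validation interval changes or is irregular.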
#11 is a natural follow-up to this