Daniel Filan
Agree with above, and think that specifying the inner batch size makes more sense.
- In particular, do we know which cells are taking too long to run?
- In general, it seems like CI tests take way longer to run on...
To be clear, the linked example is from a branch where I'm testing some Atari environments - I expect CI tests on master to be a bit quicker (but would be surprised if...
Currently in the process of adding tests to `test_reward_nets.py` that test the `CnnRewardNet` (as well as getting the example notebook runtime low enough that I stop getting `CellTimeoutError`s on the...
I think by and large this is independent of @tomtseng's PR - the biggest interaction is that it looks like CNNs should be wrapped by default.
Given discussion [here](https://github.com/HumanCompatibleAI/imitation/issues/486), will get CNNs to always transpose, rather than conditionally doing so.
Given discussion [here](https://github.com/HumanCompatibleAI/imitation/issues/486#issuecomment-1211183061), I've added a flag at the creation of the reward net to control transposition behaviour.
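
To make the transposition behaviour concrete, here is a minimal sketch of what a creation-time flag could control. It assumes image observations arrive batched in height-width-channel (HWC) order while the CNN expects channel-first (CHW) input. The function name `to_channels_first` and the flag name `hwc_format` are illustrative assumptions, not names confirmed in this thread, and NumPy stands in for the tensor library.

```python
import numpy as np


def to_channels_first(obs, hwc_format=True):
    """Return a batch of image observations in (batch, C, H, W) order.

    `hwc_format` is a hypothetical flag: when True, the input is assumed to be
    (batch, H, W, C) and is transposed; when False, the input is assumed to be
    channel-first already and is returned unchanged.
    """
    obs = np.asarray(obs)
    if obs.ndim != 4:
        raise ValueError(f"expected a 4-D batch of images, got shape {obs.shape}")
    if hwc_format:
        # Move the channel axis from last position to just after the batch axis.
        return np.transpose(obs, (0, 3, 1, 2))
    return obs


# Usage: a batch of two 84x84 frames with 4 stacked channels, in HWC order.
batch_hwc = np.zeros((2, 84, 84, 4))
batch_chw = to_channels_first(batch_hwc, hwc_format=True)
```

Setting the flag once at construction, rather than inspecting the input shape on every call, avoids guessing the layout when height, width, and channel counts happen to be ambiguous.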
Not sure what's going on with code coverage, but I think this is ready for review by @norabelrose and/or @AdamGleave when he gets back.
> Probably better to avoid forks in the future if possible though.

Yep - I think when I started the branch I wasn't able to make one in HumanCompatibleAI, but...
Added wrappers to Atari environments to make them constant length. LMK if you think there are too many environments here, or if this should be in seals or something.
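
For readers unfamiliar with the idea, here is a rough sketch of one way a constant-length wrapper can work: when the inner episode ends early, the wrapper silently resets it and only reports `done` once a fixed horizon is reached. This is not the wrapper added in the PR. `CountdownEnv`, `ConstantLengthWrapper`, and `horizon` are hypothetical names, and the toy environment uses a simple `(obs, reward, done, info)` step interface as an assumption.

```python
class CountdownEnv:
    """Toy environment (hypothetical) whose episodes end after `length` steps."""

    def __init__(self, length):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        return self.t, 1.0, done, {}


class ConstantLengthWrapper:
    """Make every episode last exactly `horizon` steps (illustrative sketch)."""

    def __init__(self, env, horizon):
        self.env = env
        self.horizon = horizon
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.steps += 1
        if done:
            # Hide the inner termination: restart the inner episode and carry on.
            obs = self.env.reset()
        # Only signal termination when the fixed horizon is reached.
        return obs, reward, self.steps >= self.horizon, info


def episode_length(env):
    """Run one episode with a dummy action and count its steps."""
    env.reset()
    n, done = 0, False
    while not done:
        _, _, done, _ = env.step(0)
        n += 1
    return n


# Usage: inner episodes of 3 steps are stitched into one 10-step episode.
wrapped = ConstantLengthWrapper(CountdownEnv(length=3), horizon=10)
length = episode_length(wrapped)
```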