Adam Gleave
@PavelCz is this still a live issue? If so, could you maybe summarize what needs to be in the default config and I'll assign it to someone?
Thanks for opening the ticket! I don't think we ever said we supported infinite-horizon environments, so I don't view lack of support as a bug. I agree the limitation should...
> Thanks for the response! I'm relatively new to RL and didn't know that infinite horizon environments could cause issues with learning and with the standard logger and eval setup....
Sure, do let me know how you get on. Otherwise, I agree we should document that we only support finite-horizon environments -- I'll assign this to someone.
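A minimal sketch of the kind of finite-horizon check users could apply themselves, assuming a Gymnasium-style API; `assert_finite_horizon` is a hypothetical helper here, not part of imitation's API:

```python
import gymnasium as gym


def assert_finite_horizon(env: gym.Env) -> None:
    """Raise if `env` does not appear to have a finite horizon.

    Heuristic only: looks for a TimeLimit-style step limit, either on the
    wrapper itself or on the registered spec.
    """
    max_steps = getattr(env, "_max_episode_steps", None)
    if max_steps is None and env.spec is not None:
        max_steps = env.spec.max_episode_steps
    if max_steps is None:
        raise ValueError(
            "Environment appears to be infinite-horizon; only finite-horizon "
            "environments are supported."
        )


env = gym.make("CartPole-v1")  # registered with a 500-step limit
assert_finite_horizon(env)
```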
38% is a lot -- I'd expect to see errors happening frequently in CI. Does this happen even on the master branch?
Ah, with 3.9% failures it's believable that we wouldn't have noticed. Agreed we should fix this, and we probably don't want it to be randomly skipped either...
Flakiness benchmarking would be great -- thanks for looking into this @Rocamonde! Agreed, let's try to avoid fixing the seed. In some cases we can always reduce the threshold required...
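A rough sketch of what such a flakiness benchmark could look like, assuming we simply re-run a single pytest test in fresh processes and report the observed failure rate; the test id and run count below are illustrative:

```python
"""Rough flakiness benchmark: re-run one test many times, report failure rate."""
import subprocess
import sys

TEST_ID = "tests/test_scripts.py::test_train_rl"  # hypothetical test id
N_RUNS = 50

failures = 0
for _ in range(N_RUNS):
    # Run the test in a fresh process so state (and seeds) don't leak between runs.
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q", TEST_ID],
        capture_output=True,
    )
    failures += result.returncode != 0

print(f"{failures}/{N_RUNS} runs failed ({100 * failures / N_RUNS:.1f}%)")
```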
I know he made a start in https://github.com/HumanCompatibleAI/imitation/pull/584 but I think those results were somewhat confounded (e.g. resource constraints from parallelism causing tests that aren't usually flaky to also fail).
Note: we will need to remove mentions of `envs` from the docs (especially once https://github.com/HumanCompatibleAI/imitation/pull/525/ is merged)
@yawen-d is this still an unresolved issue, or are the defaults sensible now that we have https://github.com/HumanCompatibleAI/imitation/pull/546?