RLs
Check that the code implementation is accurate and reasonable
- [x] check and fix C51 [deaab73]
- [x] check qrdqn [deaab73]
- [ ] check iqn
- [ ] check and fix Rainbow
- [ ] check on-policy buffer sampling
- [ ] check function `discounted_sum`
- [ ] check function `calculate_td_error`
- [ ] check whether training works well with visual input
- [ ] fix TRPO, where `step_size` is sometimes `nan`
- [ ] check `vdn` and `qmix`
- [x] review every choice of operation dimension (dim/axis) in the code and set it to -1 wherever possible
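
For the `discounted_sum` item, a small reference to compare against may help. The following is a minimal pure-Python sketch, not the repository's implementation: the name `discounted_sum_reference`, its signature, and the convention that `dones[t]` marks the last step of an episode are all assumptions made for illustration.

```python
from typing import List, Sequence


def discounted_sum_reference(
    rewards: Sequence[float],
    dones: Sequence[bool],
    gamma: float,
    bootstrap_value: float = 0.0,
) -> List[float]:
    """Hypothetical reference: discounted return for every step of a rollout.

    Assumes dones[t] is True when step t ends an episode, so no value
    from later steps (or from the bootstrap) leaks back across it.
    """
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Walk backwards so each return reuses the one computed after it.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # episode boundary: drop everything after step t
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

Comparing the output of the real function with this sketch on a short hand-checked rollout, including one with an episode boundary in the middle, is one way to tick off the item.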