HandyRL
(Change Default Outputs) feature: change default learning rate to 3e-6 * sqrt(batch_size)
A learning rate proportional to the batch size looks strange.
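For reference, a minimal sketch of the default proposed in the title, 3e-6 * sqrt(batch_size). The function name `default_learning_rate` and its `base` parameter are hypothetical and used only for illustration; they are not taken from the HandyRL code.

```python
import math


def default_learning_rate(batch_size: int, base: float = 3e-6) -> float:
    """Square-root scaling from the title: base * sqrt(batch_size).

    The name and signature are assumptions made for this sketch,
    not HandyRL's actual API.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    return base * math.sqrt(batch_size)


if __name__ == "__main__":
    # Show how the default grows with batch size under square-root scaling.
    for batch_size in (64, 256, 1024):
        print(batch_size, default_learning_rate(batch_size))
```

With this rule, quadrupling the batch size doubles the learning rate, whereas a rate proportional to the batch size would quadruple it.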