Pontryagin-Differentiable-Programming
The learning rate of the inverse KKT method in the code is inconsistent with that in the paper.
Hi @wanxinjin. I noticed that in the paper, the learning rate for the PDP, inverse KKT, and neural policy cloning methods in the imitation learning experiments is set to $\eta=10^{-4}$. However, in scripts such as "cartpole_inverseKKT.py", the parameter lr is set to 1e-7. What is the reason for this difference?