PDE-Net
Optimization Method
I was wondering why you used BFGS optimization instead of the built-in Adam or gradient descent optimizers in PyTorch?
BFGS with line search converges faster. Adam and SGD with a fixed learning rate are not stable.
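For readers who want to try a quasi-Newton optimizer with line search from within PyTorch, here is a minimal sketch. It uses `torch.optim.LBFGS`, the limited-memory variant that ships with PyTorch, which is not necessarily the BFGS implementation used in this repository. The toy least-squares problem and the names `X`, `w_true`, and `closure` are illustrative assumptions only.

```python
import torch

torch.manual_seed(0)

# Toy least-squares problem (hypothetical, not taken from PDE-Net):
# recover w_true from noiseless observations y = X @ w_true.
X = torch.randn(64, 3, dtype=torch.float64)
w_true = torch.tensor([1.5, -2.0, 0.5], dtype=torch.float64)
y = X @ w_true

w = torch.zeros(3, dtype=torch.float64, requires_grad=True)

# L-BFGS with a strong-Wolfe line search picks the step size itself,
# so no learning-rate schedule needs to be tuned by hand.
optimizer = torch.optim.LBFGS(
    [w], lr=1.0, max_iter=50, line_search_fn="strong_wolfe"
)

def closure():
    # LBFGS re-evaluates the loss several times per step,
    # so the forward and backward pass live in a closure.
    optimizer.zero_grad()
    loss = ((X @ w - y) ** 2).mean()
    loss.backward()
    return loss

optimizer.step(closure)
final_loss = closure().item()
print(final_loss, w.detach())
```

Unlike Adam or SGD, `LBFGS.step` requires the closure argument, and a single call can run up to `max_iter` inner iterations.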