sgcrfpy
add a maximum number of backtracking iterations, and early stopping
look up the stopping criterion that uses the norm of the gradient and the norm of the loss!
sometimes the algorithm proposes a step direction and then spends a long time backtracking until the learning rate shrinks to machine epsilon. perhaps, instead of taking a really small step in a bad direction, we should just stop?
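
a rough sketch of what this could look like, not taken from the sgcrfpy code: plain gradient descent with Armijo backtracking, where the line search gives up after `max_backtrack` halvings and the outer loop stops instead of taking a tiny step. all names here (`backtracking_step`, `minimize`, `max_backtrack`, `grad_tol`) are hypothetical, and the gradient test scaled by the loss magnitude is only one guess at the criterion mentioned above.

```python
import math


def backtracking_step(f, x, grad, direction, lr=1.0, beta=0.5, c=1e-4,
                      max_backtrack=20):
    """Armijo backtracking with a cap on the number of shrink steps.

    Returns (new_x, lr_used), or (None, 0.0) if no acceptable step was
    found within max_backtrack tries.
    """
    fx = f(x)
    # Directional derivative of f along the proposed direction.
    slope = sum(g * d for g, d in zip(grad, direction))
    for _ in range(max_backtrack):
        candidate = [xi + lr * di for xi, di in zip(x, direction)]
        # Sufficient-decrease (Armijo) condition.
        if f(candidate) <= fx + c * lr * slope:
            return candidate, lr
        lr *= beta  # shrink the step and try again
    return None, 0.0


def minimize(f, grad_f, x0, max_iter=100, grad_tol=1e-6, max_backtrack=20):
    """Gradient descent that stops early when the line search fails."""
    x = [float(v) for v in x0]
    for _ in range(max_iter):
        g = grad_f(x)
        grad_norm = math.sqrt(sum(gi * gi for gi in g))
        # Assumed criterion: gradient norm relative to the loss magnitude.
        if grad_norm <= grad_tol * max(1.0, abs(f(x))):
            return x, "converged"
        direction = [-gi for gi in g]
        new_x, _ = backtracking_step(f, x, g, direction,
                                     max_backtrack=max_backtrack)
        if new_x is None:
            # Bad direction: stop rather than take a near-zero step.
            return x, "line_search_failed"
        x = new_x
    return x, "max_iter"
```

returning a status string alongside `x` lets the caller tell a real convergence apart from a failed line search, which seems more useful than silently accepting a step at machine-epsilon scale.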