NGD-SGD
optimize.step() takes much longer than NGD in Kaldi nnet3.
I guess most of the time is spent computing the matrix inverse. But NGD in Kaldi nnet3 is fast. Why?
I have read Povey's paper, and I have read some of your code and the Kaldi nnet3 Precondition code, but I have not dug into them very deeply. Could you explain the reasons for the difference and suggest any way to optimize this? Thanks.