zhangtj1996

Results 3 issues of zhangtj1996

Hi, I'm wondering what `mu[i] * self._T` means in the first part of the likelihood. It is not consistent with the paper, where that term should be lambda * delta t.

Optimizers such as SGD with momentum, Adam, and RMSprop may use historical gradient information. Does this implementation maintain, reset, or interpolate the momentum in each outer loop?

Will RDC for redundancy be supported in the near future?