renyiryry

Results 1 comments of renyiryry

I agree. I'd also like to kindly point out that if the per-example losses are summed instead of averaged when computing the mini-batch gradient, the gradient might need to be re-scaled based on the size...
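
To make the scaling point concrete, here is a minimal sketch. The comment above is cut off, so this assumes the "size" it refers to is the mini-batch size. The 1-D linear model, the squared-error loss, and the names `grad_sum` and `grad_mean` are hypothetical, chosen only for illustration.

```python
# Hypothetical setup: 1-D linear model y_hat = w * x with squared-error loss.
# Assumption: the "size" in the comment above means the mini-batch size.

def grad_sum(w, xs, ys):
    """Gradient w.r.t. w of the SUM of per-example losses (w*x - y)^2."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))

def grad_mean(w, xs, ys):
    """Gradient w.r.t. w of the MEAN of the same per-example losses."""
    return grad_sum(w, xs, ys) / len(xs)

w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]   # one mini-batch of 4 examples
ys = [2.0, 4.0, 6.0, 8.0]

g_sum = grad_sum(w, xs, ys)
g_mean = grad_mean(w, xs, ys)

# The summed gradient is larger by a factor equal to the batch size,
# so a learning rate tuned for the mean must be divided by that factor
# to produce the same update from the summed gradient.
lr = 0.01
step_from_mean = lr * g_mean
step_from_sum = (lr / len(xs)) * g_sum

print(g_sum, g_mean)
print(step_from_mean, step_from_sum)
```

Under these assumptions, the two updates coincide only once the summed gradient (or, equivalently, the learning rate) is divided by the batch size. Without that re-scaling, changing the batch size also silently changes the effective step size.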