renyiryry
I agree. Also, a kind reminder: if the sum is used instead of the average when computing the mini-batch gradient, the gradient might need to be re-scaled based on the size...
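To illustrate the point, here is a minimal NumPy sketch. It assumes "the size" means the mini-batch size and uses a squared-error loss on a linear model purely as an example; the function name `minibatch_gradient` and the `reduction` argument are hypothetical and do not come from any particular library.

```python
import numpy as np

def minibatch_gradient(X, y, w, reduction="mean"):
    """Gradient of the squared-error loss 0.5 * (x @ w - y)**2 over a mini-batch.

    Hypothetical helper for illustration only.
    """
    residual = X @ w - y        # shape: (batch_size,)
    grad_sum = X.T @ residual   # sum of the per-example gradients
    if reduction == "sum":
        return grad_sum
    # "mean": divide the summed gradient by the mini-batch size
    return grad_sum / X.shape[0]

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))     # mini-batch of 8 examples, 3 features
y = rng.normal(size=8)
w = np.zeros(3)

g_sum = minibatch_gradient(X, y, w, reduction="sum")
g_mean = minibatch_gradient(X, y, w, reduction="mean")

# Re-scaling the summed gradient by the batch size recovers the averaged one.
rescaled = g_sum / X.shape[0]
```

With a sum, the gradient magnitude grows with the batch size, so the same learning rate would give larger steps for larger batches unless the gradient (or the learning rate) is re-scaled accordingly.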