nntrainer

[bn layer] derivative of deviation

Open lhs8928 opened this issue 1 year ago • 1 comments

When computing the derivative with respect to the deviation (2), there are two incoming derivatives: one from the variance (3) and one from the intermediate output (5). These two incoming derivatives should be merged and then averaged. However, the current implementation averages the derivative from the intermediate output (5) first and then subtracts (merges) the other one, the derivative from the variance. I think this might lead to a different result.

Please check the following diagram.

image

(1) input (2) deviation (3) variance (4) inv_std
(5) intermediate output (6) gamma (7) beta (8) output
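
To make the two orderings concrete, here is a minimal pure-Python sketch for a single feature over a batch. It is not nntrainer's code: the function name `bn_backward_two_ways`, the scalar `gamma`, the `eps` value, and the assumption that the nodes are wired as standard batch normalization (deviation = input - mean, variance = mean of squared deviation, inv_std = 1 / sqrt(variance + eps), intermediate output = deviation * inv_std) are all illustrative guesses based on the node names above.

```python
import math

def bn_backward_two_ways(x, dy, gamma=1.0, eps=1e-5):
    """Return dL/dx computed with both orderings (hypothetical helper)."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]              # (2) deviation
    var = sum(d * d for d in dev) / n          # (3) variance
    inv_std = 1.0 / math.sqrt(var + eps)       # (4) inv_std

    # Derivative arriving at the intermediate output (5): y = gamma * x_hat + beta
    dx_hat = [g * gamma for g in dy]

    # Incoming derivative to (2) from the intermediate output (5)
    d_dev_out = [g * inv_std for g in dx_hat]

    # Incoming derivative to (2) from the variance (3), via inv_std (4)
    d_inv_std = sum(g * d for g, d in zip(dx_hat, dev))
    d_var = d_inv_std * (-0.5) * inv_std ** 3
    d_dev_var = [d_var * 2.0 * d / n for d in dev]

    # Ordering 1: merge both incoming derivatives, then subtract the batch mean
    merged = [a + b for a, b in zip(d_dev_out, d_dev_var)]
    merged_mean = sum(merged) / n
    dx_merge_first = [v - merged_mean for v in merged]

    # Ordering 2: subtract the batch mean of the (5) path only, then merge the (3) path
    out_mean = sum(d_dev_out) / n
    dx_average_first = [a - out_mean + b for a, b in zip(d_dev_out, d_dev_var)]

    return dx_merge_first, dx_average_first


if __name__ == "__main__":
    a, b = bn_backward_two_ways([1.0, 2.0, 4.0, 7.0], [0.1, -0.2, 0.3, 0.5])
    print(max(abs(p - q) for p, q in zip(a, b)))
```

Under these assumptions, the variance-path term is proportional to the deviation, whose batch mean is zero, so averaging it contributes nothing and any gap between the two orderings would show up only at floating-point level. Whether the same holds for the actual implementation depends on details not shown in this issue, such as how the variance-path term is computed and which axes are reduced.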

lhs8928 avatar Aug 09 '22 06:08 lhs8928

:octocat: cibot: Thank you for posting issue #1977. The person in charge will reply soon.

taos-ci avatar Aug 09 '22 06:08 taos-ci