Infermo
implemented std_grad in backward.mojo
@TilliFe @andresnowak please verify the implementation #9
From what I understand, the only thing that is incorrect is `a.grad.simd_load[_nelts](idx_a) * std_derivative`: instead of a multiplication it should be a sum. We have to accumulate the gradient into the `a` tensor, applying the chain rule in this operation: `std_derivative * b.grad.load(idx_b)`.
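To make the accumulation point concrete, here is a minimal NumPy sketch of the idea, not the Mojo code from this PR: the local derivative of `std` is chained with the upstream gradient and then summed into the input's gradient buffer. The function name `std_backward`, the `eps` term, and the population-std assumption are illustrative only.

```python
import numpy as np

def std_backward(a_value, a_grad, b_grad, eps=1e-8):
    """Accumulate the gradient of b = std(a) into a_grad.

    a_value : forward-pass input values
    a_grad  : gradient buffer of the input, updated in place
    b_grad  : upstream gradient flowing from the std output

    Key point from the review: the local derivative is combined with
    the upstream gradient via the chain rule and then *added* to
    a_grad (accumulation), not multiplied into it.
    """
    n = a_value.size
    mean = a_value.mean()
    std = a_value.std()  # population std assumed for the forward pass
    # d std / d a_i = (a_i - mean) / (n * std)
    std_derivative = (a_value - mean) / (n * std + eps)
    # chain rule, accumulated (summed) into the existing gradient buffer
    a_grad += std_derivative * b_grad
    return a_grad

# usage example
a = np.array([1.0, 2.0, 3.0, 4.0])
a_grad = np.zeros_like(a)
std_backward(a, a_grad, b_grad=1.0)
print(a_grad)
```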
Thanks @ManishAradwad :). I will merge this as soon as I am back. Leave this PR open.
> From what I understand, the only thing that is incorrect is `a.grad.simd_load[_nelts](idx_a) * std_derivative`: instead of a multiplication it should be a sum. We have to accumulate the gradient into the `a` tensor, applying the chain rule in this operation: `std_derivative * b.grad.load(idx_b)`.
Right! I'll correct it