For addition, incrementing the gradient makes sense, but I can't make sense of incrementing it for multiplication too. Potential bug?

Open srik-git opened this issue 1 year ago • 1 comment

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other), '*')

        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward

        return out

If you have an expression of the type (x*y)*(x*z), then the gradient w.r.t. x is not additive, right?

srik-git, Jun 14 '24 19:06

I don't get what you mean by the expression '(x*y)*(x*z)', but here is the logic behind incrementing the previous value of the gradient:

Consider an expression like y = (a * b) + (a * c).

When we backpropagate through the expression (a * b), its contribution to the gradient of y with respect to a is out.grad * b (for this example out.grad will be 1 at that point), and its contribution to the gradient of y with respect to b is a * out.grad.

So what we currently have is:

a.grad = b

b.grad = a

Then when we are trying to evaluate the second expression (a * c) by a similar procedure, we find

c.grad = a

but here we should not say a.grad = c. We should increment the previous a.grad by c. So, a.grad += c.
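The difference between overwriting and accumulating can be checked by hand. Here is a minimal sketch using plain floats; the values a = 2, b = 3, c = 4 and the variable names are made up for illustration and are not part of micrograd.

```python
# Hand-rolled backward pass for y = (a * b) + (a * c), using plain floats.
# The numeric values are arbitrary illustration values.
a, b, c = 2.0, 3.0, 4.0

out_grad = 1.0  # dy/d(a*b) and dy/d(a*c) are both 1 because y is their sum

# Overwriting: the second product clobbers the contribution of the first.
a_grad_overwrite = 0.0
a_grad_overwrite = b * out_grad   # contribution from a * b
a_grad_overwrite = c * out_grad   # contribution from a * c replaces the line above

# Accumulating: each use of `a` adds its own contribution.
a_grad_accumulate = 0.0
a_grad_accumulate += b * out_grad  # contribution from a * b
a_grad_accumulate += c * out_grad  # contribution from a * c

print(a_grad_overwrite)    # 4.0, which misses the b term
print(a_grad_accumulate)   # 7.0, which equals b + c
```

With assignment, the result depends on which product happens to be processed last; with `+=`, the order does not matter.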

In the end we should have:

a.grad = b + c

b.grad = a

c.grad = a

This is what we expect from regular calculus: for y = a*b + a*c, dy/da = b + c, dy/db = a and dy/dc = a.
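To check this numerically, here is a stripped-down sketch of a `Value` class with only `+`, `*` and `backward`. It is a simplified stand-in written for this reply, not micrograd's full class, and the input values are arbitrary. It also tries a product of two products, assuming the expression in the question was meant as (x*y)*(x*z); calculus gives d/dx of x²·y·z as 2·x·y·z, which is again the sum of the two contributions.

```python
class Value:
    """Stripped-down scalar autograd node (illustrative, not micrograd's full class)."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # Accumulate, because self/other may feed into several nodes.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topological order, so a node's grad is complete before it is propagated.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)

        self.grad = 1.0
        for v in reversed(topo):
            v._backward()


# Case 1: s = a*b + a*c, expected a.grad = b + c
a, b, c = Value(2.0), Value(3.0), Value(4.0)
s = a * b + a * c
s.backward()
print(a.grad, b.grad, c.grad)  # 7.0 2.0 2.0

# Case 2: w = (x*y)*(x*z), expected x.grad = 2*x*y*z
x, y, z = Value(2.0), Value(3.0), Value(4.0)
w = (x * y) * (x * z)
w.backward()
print(x.grad)  # 48.0
```

In both cases `x` (or `a`) appears in two sub-expressions, and the chain rule sums the gradient flowing back along each path, which is why `_backward` uses `+=` for multiplication as well as addition.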

I hope this clears things up for you.

daghanerdonmez, Jun 20 '24 11:06