Michael Abbott
I wonder how much we'd have to change in Base to delete Tangent entirely. Being able to add them is the big virtue, what chance we could get these methods...
BTW, status is that I think it's probably too unsafe to load this onto `+`. It would be better to have a dedicated function for accumulation that is not used...
Whatever else it does, I think `add!!(x::Array, y::Array)` will always mutate `x`. I think this isn't acceptable for `accum` as, right now, we do not demand that `rrule`s return something...
*Unsafe* like `+`, I'd say. In that it returns gradients for two arguments which may well alias each other. Done manually? You could wrap both in a ReadOnlyArray. But forgetting...
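To make the aliasing concern concrete, here is a minimal sketch using only Base. The names `plus_pullback` and `accum_inplace!` are hypothetical stand-ins invented for illustration, not anything from ChainRules or Zygote. It assumes a pullback for `+` that hands the same array back for both arguments, and an accumulator that always mutates its first argument.

```julia
# Hypothetical pullback for `+`: both input gradients are the very same array.
plus_pullback(dy) = (dy, dy)

# Hypothetical accumulator that always mutates its first argument.
accum_inplace!(x::Array, y::Array) = (x .+= y; x)

dy = [1.0, 2.0, 3.0]
da, db = plus_pullback(dy)
da === db                      # the two gradients alias each other

other = [10.0, 10.0, 10.0]
accum_inplace!(da, other)      # intended to update only `da`...
db                             # ...but `db` (and `dy`) changed as well

# Out-of-place accumulation leaves the aliased inputs alone.
dy2 = [1.0, 2.0, 3.0]
da2, db2 = plus_pullback(dy2)
total = da2 + other            # allocates a new array
db2                            # still the original values
```

Under these assumptions the mutating version silently corrupts the second gradient, which is the failure mode a dedicated accumulation function would have to rule out.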
> I suspect it's important that the tangent only is an alias for another tangent, if the primal is also an alias for another primal. This strikes me as being...
More broadly, am I right to think the "functional" approach of how `rrule` works is more-or-less inherited from Zygote? The alternative would be to allocate all buffers on the forward...
You'd have to zero them all before the second backward pass. (After copying the contents to one slice of the output.) Tracker will give an error if you accidentally...
> should be generically correct for any abstractarray object

I think it needs to ensure a mutable copy, more like `b = copyto!(similar(a, T, axes(a)), a)`. This is poorly documented...
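As a rough illustration of the `copyto!(similar(...))` pattern quoted above, here is a sketch using only Base. The helper name `mutable_copy` is hypothetical, and a range is used as the example of an array that cannot be written to.

```julia
# Hypothetical helper wrapping the expression from the comment above.
mutable_copy(a::AbstractArray, ::Type{T}) where {T} = copyto!(similar(a, T, axes(a)), a)
mutable_copy(a::AbstractArray) = mutable_copy(a, eltype(a))

r = 1:3                        # a range: indexable, but it does not support setindex!
b = mutable_copy(r, Float64)   # similar(...) allocates a writable Vector{Float64}
b[1] = 10.0                    # safe to mutate; `r` is untouched
```

The point of going through `similar` is that it asks the array type for a writable container of the requested element type and axes, so the result can be mutated even when the input cannot.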
Yet `NoTangent() + 1.0` returns no error, so it won't help you find such a mistake. Nor does `ZeroTangent() + NoTangent()`, which can only occur if one primal has both....
One more thought. If `NoTangent` really encoded that this thing could never ever be perturbed, then it should win over other (mistaken) information, e.g. from a rule defined on some...