pytorch_forward_forward

Open zym1599 opened this issue 2 years ago • 2 comments

First of all, thank you for sharing. You say, "This backward pass only calculates derivatives, so it is not considered backpropagation." But Hinton's point in the paper is that learning should not require knowing the details of the forward computation in order to take derivatives. Isn't using derivatives here a step against the point made in the paper? Thank you.

zym1599 · Jan 11 '23

Backpropagation requires gradients to flow backward across layers, so it cannot perform purely local updates. Here, however, gradients never cross layer boundaries: they are used for local updates only. Each layer therefore has its own decoupled backward pass, which is why those lines of code perform the local update against the layer's own cost function. This can be viewed as one-layer backpropagation, and it could be replaced by other estimation/optimization methods that do not require gradients at all. In other words, this code uses local gradients without loss of generality and does not contradict Hinton's claim.
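
For illustration, here is a minimal sketch of what such a decoupled, per-layer update looks like in PyTorch. This is not the repo's exact code; the class name `LocalLayer`, the threshold value, and the goodness-based loss are illustrative assumptions that follow the general forward-forward recipe. The point is that each layer calls `loss.backward()` on its own local loss and detaches its outputs, so no gradient ever reaches an earlier layer:

```python
import torch
import torch.nn as nn

class LocalLayer(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, in_dim, out_dim, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.act = nn.ReLU()
        self.threshold = threshold
        # Each layer owns its optimizer: updates are purely local.
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        return self.act(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # "Goodness" is the mean squared activation; positive samples should
        # exceed the threshold, negative samples should fall below it.
        g_pos = self.forward(x_pos).pow(2).mean(dim=1)
        g_neg = self.forward(x_neg).pow(2).mean(dim=1)
        loss = torch.log1p(torch.exp(torch.cat([
            -(g_pos - self.threshold),   # push positive goodness up
            (g_neg - self.threshold),    # push negative goodness down
        ]))).mean()
        self.opt.zero_grad()
        loss.backward()   # gradients stay inside this layer: one-layer "backprop"
        self.opt.step()
        # Detach outputs so the next layer's update cannot send gradients back here.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()

layers = [LocalLayer(784, 500), LocalLayer(500, 500)]
x_pos, x_neg = torch.randn(64, 784), torch.randn(64, 784)
for layer in layers:
    x_pos, x_neg = layer.train_step(x_pos, x_neg)
```

The `loss.backward()` call here is exactly the local-gradient step discussed above; swapping it for a gradient-free estimator (while keeping the detached, per-layer structure) would not change the overall training loop.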

makrout · Jan 20 '23

Other estimation/optimization methods that do not require gradients <- any suggestions?

taomanwai · May 28 '24