pytorch_forward_forward
First of all, thank you for sharing. You say, "This backward pass only calculates derivatives, so it is not considered backpropagation." Hinton points out that learning should not require knowing the specifics of the forward computation in order to take derivatives. Isn't using derivatives here a step against that point in the paper? Thank you.
Backpropagation requires gradients to flow back across layers, and hence it does not enjoy locality of updates. Here, however, gradients do not flow across layers; they are used for local updates only. For this reason, the layers have decoupled backward passes, which is why those lines of code perform the local update using the layer's own cost function. This can be viewed as one-layer backpropagation, and it could be substituted with other estimation/optimization methods that do not require gradients. In other words, the code uses local gradients without loss of generality, and this is not against Hinton's claim.
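To make the decoupling concrete, here is a minimal sketch of such a per-layer local update in PyTorch. It is not the repository's exact code: the layer sizes, the threshold, and the `goodness_loss` helper are illustrative assumptions in the spirit of the Forward-Forward goodness objective. The key point is the `.detach()` between layers, so each `loss.backward()` only produces derivatives for the current layer's parameters and no gradient flows across layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

layers = nn.ModuleList([nn.Linear(784, 500), nn.Linear(500, 500)])
opts = [torch.optim.Adam(layer.parameters(), lr=0.03) for layer in layers]
threshold = 2.0  # illustrative goodness threshold (assumption)

def goodness_loss(h, positive):
    # Goodness = mean squared activation; push it above the threshold for
    # positive data and below it for negative data.
    g = h.pow(2).mean(dim=1)
    sign = -1.0 if positive else 1.0
    return F.softplus(sign * (g - threshold)).mean()

x_pos = torch.rand(16, 784)  # placeholder "positive" (real) batch
x_neg = torch.rand(16, 784)  # placeholder "negative" (corrupted) batch

h_pos, h_neg = x_pos, x_neg
for layer, opt in zip(layers, opts):
    # Forward through this layer only.
    z_pos = torch.relu(layer(h_pos))
    z_neg = torch.relu(layer(h_neg))
    loss = goodness_loss(z_pos, True) + goodness_loss(z_neg, False)

    opt.zero_grad()
    loss.backward()  # derivatives for THIS layer's parameters only
    opt.step()

    # Detach before feeding the next layer, so no gradient crosses layers.
    h_pos, h_neg = z_pos.detach(), z_neg.detach()
```

Because the inputs to each layer are detached, the one-layer backward here is just a convenient way to minimize the local cost; any other per-layer optimizer could take its place.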
Other estimation/optimization methods that do not require gradients <- any suggestions?