cooper
Extrapolation from the past
Enhancement
Implement the extrapolation from the past algorithm (Popov, 1980). A good, modern reference is Gidel et al. (2019).
Like extragradient, this algorithm computes the direction for updating parameters from a "lookahead" step. It is computationally cheaper than extragradient and enjoys similar convergence guarantees for some classes of problems (Gidel et al., 2019).
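For reference, the updates described in Gidel et al. (2019) take the following form, where $w_t$ are the parameters, $\eta$ is the step size, and $F$ is the gradient vector field of the saddle-point problem (notation here is mine, not from a specific source):

$$
\begin{aligned}
\text{extrapolation:} \quad & w_{t+1/2} = w_t - \eta\, F(w_{t-1/2}) \\
\text{update:} \quad & w_{t+1} = w_t - \eta\, F(w_{t+1/2})
\end{aligned}
$$

The lookahead step reuses $F(w_{t-1/2})$, the gradient stored from the previous iteration (initialized as $F(w_0)$), so only one fresh gradient, $F(w_{t+1/2})$, is computed per update.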
Motivation
Whereas extragradient requires two gradient computations per parameter update, extrapolation from the past stores the gradient from the previous extrapolation step and reuses it in the current one. This reduces the cost to one fresh gradient computation per update, which may be helpful in settings where gradients are expensive.
However, storing the previous gradient still incurs a memory overhead compared to plain gradient descent-ascent.
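As a minimal sketch of the idea (not cooper's API; the function and variable names below are illustrative), the stored-gradient scheme on a toy bilinear saddle-point problem looks like this:

```python
import numpy as np

def extrapolation_from_the_past(F, w0, lr=0.3, steps=300):
    """One-gradient-per-update variant of extragradient (Popov, 1980).

    F: gradient vector field of the saddle-point problem.
    Returns the iterate after `steps` updates.
    """
    w = np.asarray(w0, dtype=float)
    g_prev = F(w)  # gradient at the previous lookahead point (init: F(w0))
    for _ in range(steps):
        w_look = w - lr * g_prev  # lookahead step reusing the stored gradient
        g_prev = F(w_look)        # the only fresh gradient this iteration
        w = w - lr * g_prev       # actual parameter update
    return w

# Toy bilinear problem min_x max_y x*y, with vector field F(x, y) = (y, -x).
# Plain gradient descent-ascent diverges on it; this scheme converges.
F = lambda w: np.array([w[1], -w[0]])
w_star = extrapolation_from_the_past(F, [1.0, 1.0])
print(w_star)  # approaches the saddle point (0, 0)
```

Note the single call to `F` inside the loop, versus the two calls per iteration that extragradient would need, at the cost of keeping `g_prev` in memory.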
References
- G. Gidel, H. Berard, G. Vignoud, P. Vincent, S. Lacoste-Julien. A Variational Inequality Perspective on Generative Adversarial Networks. In ICLR, 2019.
- L. D. Popov. A Modification of the Arrow-Hurwicz Method for Search of Saddle Points. Mathematical Notes of the Academy of Sciences of the USSR, 1980.