
Out of memory when the meta optimizer updates parameters

Open · zengxianyu opened this issue 6 years ago · 3 comments

Hello, I find your code very helpful, but too much memory is consumed when the meta optimizer updates the parameters of the model. On my computer, it always raises an 'out of memory' error when it executes line 140 of meta_optimizer.py.

I think it could consume less memory if the MetaModel class held a flat version of the parameters instead of wrapping a model. In this way, the MetaModel would reshape the parameters and compute the result through nn.functional.conv/linear, so that the meta optimizer could use this flat version of the parameters directly, without allocating extra memory for flattened parameters.
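
Roughly what I have in mind, as a minimal sketch (the FlatMetaModel name, the two-layer MLP, and all shapes are hypothetical and only for illustration, not taken from this repo):

    import torch
    import torch.nn.functional as F

    class FlatMetaModel:
        # Keeps every parameter in one 1-D tensor; the (name, shape) pairs
        # describe how that flat vector is partitioned. Shapes assume a
        # small 784 -> 20 -> 10 MLP purely for illustration.
        SHAPES = [("w1", (20, 784)), ("b1", (20,)),
                  ("w2", (10, 20)), ("b2", (10,))]

        def __init__(self, flat_params):
            self.flat_params = flat_params  # 1-D tensor holding all weights

        def _unflatten(self):
            params, offset = {}, 0
            for name, shape in self.SHAPES:
                numel = 1
                for d in shape:
                    numel *= d
                params[name] = self.flat_params[offset:offset + numel].view(shape)
                offset += numel
            return params

        def forward(self, x):
            # Views into the flat tensor are reshaped on the fly and fed to
            # the functional API, so no separate parameter copies are kept.
            p = self._unflatten()
            h = F.relu(F.linear(x, p["w1"], p["b1"]))
            return F.linear(h, p["w2"], p["b2"])

    # Usage sketch: 15910 = 20*784 + 20 + 10*20 + 10 parameters in total
    flat = torch.randn(15910, requires_grad=True)
    out = FlatMetaModel(flat).forward(torch.randn(32, 784))

The meta optimizer could then read and write flat_params directly, and the views created in forward stay connected to it in the autograd graph.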

zengxianyu avatar Aug 24 '17 08:08 zengxianyu

I have kind of the same issue. On the line of code `flat_params = self.f * flat_params - self.i * Variable(flat_grads)`, my computer takes a lot of time (building the computation graph for 25000 parameters) and then I can't print flat_params (in normal running or in debugger mode). I think my Mac just doesn't have enough memory. It seems a GPU is required to train the meta-optimizer.

Forbu avatar Feb 09 '18 09:02 Forbu

Never mind, that was not the problem. The problem was most likely a version change in PyTorch, so the operation `flat_params = self.f * flat_params - self.i * Variable(flat_grads)` produces a 25450×25450 matrix (not supported by my computer). I changed it to:

        flat_params = torch.t(self.f) * flat_params - torch.t(self.i) * Variable(flat_grads)
        flat_params = flat_params.view(-1)

and it works
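
For reference, the blow-up is consistent with broadcasting: if self.f has shape (N, 1) while flat_params has shape (N,), the elementwise product broadcasts to an N×N matrix. A minimal sketch reproducing the shapes (the (N, 1) shape of self.f is my assumption here):

    import torch

    n = 5                        # stand-in for the ~25450 real parameters
    f = torch.rand(n, 1)         # per-parameter coefficient, shape (n, 1)
    flat_params = torch.rand(n)  # flat parameter vector, shape (n,)

    bad = f * flat_params                       # broadcasts to (n, n)
    good = (torch.t(f) * flat_params).view(-1)  # (1, n) * (n,) -> (n,)

    print(bad.shape)   # torch.Size([5, 5])
    print(good.shape)  # torch.Size([5])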

Forbu avatar Feb 09 '18 10:02 Forbu

Sorry, I haven't had enough time recently to fix problems with this code.

Could you submit a PR with this fix?

ikostrikov avatar Feb 09 '18 12:02 ikostrikov