neuralconvo
The memory usage skyrockets each time it saves
And it doesn't free the memory. I executed this command:
```
th train.lua --cuda --dataset 50000 --hiddenSize 1000
```
On the first epoch it consumed 2 GiB of RAM, on the second 5 GiB, then 10 GiB, and by the 11th epoch my memory was full. (My computer has 32 GiB of RAM.)

This issue disappeared when I commented out lines 156 to 171 in train.lua (the RAM usage then stays at 1.2 GiB):
```lua
if minMeanError == nil or errors:mean() < minMeanError then
  print("\n(Saving model ...)")
  params, gradParams = nil, nil
  collectgarbage()
  -- Model is saved as CPU
  model:float()
  torch.save("data/model.t7", model)
  collectgarbage()
  if options.cuda then
    model:cuda()
  elseif options.opencl then
    model:cl()
  end
  collectgarbage()
  minMeanError = errors:mean()
end
```
So I conclude that the saving process may be the problem.
Seems to occur in the calls to model:float(). My workaround was to just save in GPU format:
```lua
if minMeanError == nil or errors:mean() < minMeanError then
  print("\n(Saving model ...)")
  params, gradParams = nil, nil
  collectgarbage()
  -- Workaround: save directly in GPU format, skipping model:float()
  torch.save("data/model.t7", model)
  collectgarbage()
  minMeanError = errors:mean()
end
```
I then added require 'cudnn' to the top of eval.lua in order to be able to load the saved model. If you want to save the model in CPU format, you could write a quick script to load the model, call model:float(), and save it again.
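A minimal sketch of such a conversion script (the file name convert_model.lua and the output path data/model-cpu.t7 are illustrative, not from this repo; you may also need to require whatever packages the model's layers come from):

```lua
-- convert_model.lua: load the GPU-format model and re-save it in CPU format.
-- Sketch only; the output path is an assumption.
require 'cutorch'
require 'cunn'
require 'cudnn'  -- needed to deserialize cudnn layers, as noted above

local model = torch.load("data/model.t7")
model:float()  -- convert all parameters to CPU FloatTensors
torch.save("data/model-cpu.t7", model)
```

Run it with th convert_model.lua, then use the CPU copy wherever a CPU-format model is needed.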
Thanks for your simple solution, @Namburgesas. Hope there's a fix in the future.
Did you try doing clearState() before calling model:float()? It clears the intermediary states in the model (they are not needed for prediction).
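For concreteness, a minimal sketch of that suggestion applied to the save block quoted above (clearState() is a standard nn method; everything else is unchanged):

```lua
if minMeanError == nil or errors:mean() < minMeanError then
  print("\n(Saving model ...)")
  params, gradParams = nil, nil
  collectgarbage()
  model:clearState()  -- drop intermediate output/gradInput buffers before saving
  -- Model is saved as CPU
  model:float()
  torch.save("data/model.t7", model)
  collectgarbage()
  if options.cuda then
    model:cuda()
  elseif options.opencl then
    model:cl()
  end
  collectgarbage()
  minMeanError = errors:mean()
end
```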