
Help with using autograd in training with wrapped NN modules

Open synchro-- opened this issue 8 years ago • 4 comments

Let's say I have a whole network built with nn, called 'model', that I wrapped like this:

modelFunction, params = autograd.functionalize(model)
neuralNet = function(params, input, target) ...  return myCustomLoss end   
df = autograd(neuralNet)
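
For concreteness, this is roughly what I mean — just a sketch, with a hypothetical squared-error loss standing in for myCustomLoss, and assuming (as I understand functionalize) that modelFunction(params, input) runs the forward pass of the wrapped net:

local autograd = require 'autograd'

-- wrap the nn container so it can be used inside an autograd function
local modelFunction, params = autograd.functionalize(model)

local neuralNet = function(params, input, target)
   local prediction = modelFunction(params, input)
   -- hypothetical stand-in for myCustomLoss: a plain squared-error loss
   return torch.sum(torch.pow(prediction - target, 2))
end

-- df(params, input, target) returns the gradients w.r.t. params,
-- followed by everything neuralNet returns (here: the loss)
local df = autograd(neuralNet)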

Now I want to train my model. Since I already have my typical training procedure (with the mini-batch closure) written and ready, I would like to keep most of it, only exploiting the very easy way autograd gives me the gradients. So let's compare the two methods, so that you can tell me whether this is actually possible.

The usual:

local feval = function(x)
  if x ~= parameters then
     parameters:copy(x)
  end
  -- reset gradients
  gradParameters:zero()

  -- f is the average of all criterions
  local f = 0

  -- evaluate function for complete mini batch
  for i = 1,#inputs do
     -- estimate f
     local output = model:forward(inputs[i])
     local err = criterion:forward(output, targets[i])
     f = f + err

     -- estimate df/dW
     local df_do = criterion:backward(output, targets[i])
     model:backward(inputs[i], df_do)
  end
  -- normalize gradients and f(X)
  gradParameters:div(#inputs)
  f = f/#inputs

  -- return f and df/dX
  return f,gradParameters
end
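
For completeness, this is roughly how I drive that closure with optim (sketch only; maxEpochs is just an illustrative name, and the mini-batch inputs/targets and optimState are set up elsewhere in my script):

local optimState = {learningRate = 0.01}
for epoch = 1, maxEpochs do
   -- inputs/targets for the current mini-batch are prepared here
   local _, fs = optim.sgd(feval, parameters, optimState)
   print('epoch ' .. epoch .. ', loss ' .. fs[1])
end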

So, using autograd while making the smallest possible changes, it would be:

-- create closure to evaluate f(X) and df/dX
local feval = function(x)
   -- get new parameters
   if x ~= parameters then
      parameters:copy(x)
   end
   -- reset gradients
   gradParameters:zero()

   -- f is the average of all criterions
   local f = 0

   -- evaluate function for complete mini batch
   for i = 1,#inputs do
      -- estimate f and df/dW in a single call to the autograd-wrapped function
      local df_do, err, output = df(params, inputs[i], targets[i])
      f = f + err
      model:backward(inputs[i], df_do)
   end

   -- normalize gradients and f(X)
   gradParameters:div(#inputs)
   f = f/#inputs
   -- return f and df/dX
   return f,gradParameters
end

And then I would go on using the optim module in the classical way. Is this not possible, or not recommended?
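
Or, as an alternative I have considered but not yet run: drop the model:backward call entirely and accumulate the gradient table that df returns into the flat gradParameters myself. This assumes parameters, gradParameters = model:getParameters() was called before functionalize, and that the tensors in params line up with the layout of the flat parameters tensor (which is how getParameters builds it):

local feval = function(x)
   if x ~= parameters then
      parameters:copy(x)
   end
   gradParameters:zero()
   local f = 0
   for i = 1, #inputs do
      -- df returns the gradients w.r.t. params plus the loss
      local grads, err = df(params, inputs[i], targets[i])
      f = f + err
      -- copy the per-sample gradient table into the flat gradParameters
      local offset = 0
      for _, g in ipairs(grads) do
         local n = g:nElement()
         gradParameters:narrow(1, offset + 1, n):add(g:contiguous():view(n))
         offset = offset + n
      end
   end
   gradParameters:div(#inputs)
   f = f / #inputs
   return f, gradParameters
end

Would that be the preferred way to plug autograd into an existing optim loop?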

synchro-- avatar Jan 04 '17 05:01 synchro--

@synchro-- were you successful in doing this? I am mixing optim with wrapped nn modules and getting the following error:

/Graph.lua:40: bad argument #2 to 'fn' (expecting number or torch.DoubleTensor or torch.DoubleStorage at /tmp/luarocks_torch-scm-1-9261/torch7/generic/Tensor.c:1125)

sebastiangonsal avatar Mar 23 '17 10:03 sebastiangonsal

I actually never tried it in the end. I dropped the project because I was working on something else, but it is something I could try in the coming weeks. Keep me posted if you manage to use optim that way.

synchro-- avatar Mar 23 '17 13:03 synchro--

I can confirm that it is possible to mix optim with wrapped nn modules. The errors you are hitting are likely due to features that autograd does not support.

biggerlambda avatar Mar 28 '17 00:03 biggerlambda

@biggerlambda Thanks. Do you have any example of that?

synchro-- avatar Apr 19 '17 16:04 synchro--