neural-style
Does TVLoss support batch sizes greater than one?
I am wondering whether this code is doing the right thing.
-- multi-batchsize version?
function TVLoss:updateGradInput(input, gradOutput)
  self.gradInput:resizeAs(input):zero()
  local N, C, H, W = input:size(1), input:size(2), input:size(3), input:size(4)
  self.x_diff:resize(4, H - 1, W - 1)
  self.y_diff:resize(4, H - 1, W - 1)
  self.x_diff:copy(input[{{}, {}, {1, -2}, {1, -2}}])
  self.x_diff:add(-1, input[{{}, {}, {1, -2}, {2, -1}}])
  self.y_diff:copy(input[{{}, {}, {1, -2}, {1, -2}}])
  self.y_diff:add(-1, input[{{}, {}, {2, -1}, {1, -2}}])
  self.gradInput[{{}, {}, {1, -2}, {1, -2}}]:add(self.x_diff):add(self.y_diff)
  self.gradInput[{{}, {}, {1, -2}, {2, -1}}]:add(-1, self.x_diff)
  self.gradInput[{{}, {}, {2, -1}, {1, -2}}]:add(-1, self.y_diff)
  self.gradInput:mul(self.strength)
  self.gradInput:add(gradOutput)
  return self.gradInput
end
These two lines do not make sense to me:
self.x_diff:resize(4, H - 1, W - 1)
self.y_diff:resize(4, H - 1, W - 1)
I assume the sizes should match the input, except for the H - 1 and W - 1 dimensions:
self.x_diff:resize(N, C, H - 1, W - 1)
self.y_diff:resize(N, C, H - 1, W - 1)
I see that Justin's original code has
self.x_diff:resize(3, H - 1, W - 1)
which makes sense because that code assumes RGB images, so C is always 3.
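The shape argument above can be checked directly. Here is a quick sketch (NumPy is used purely for illustration; the variable names mirror the Lua ones): slicing off the last row and column keeps every leading dimension, so the difference buffers need all four dimensions for a batched input, but only three for a single image.

```python
import numpy as np

N, C, H, W = 4, 3, 8, 8

# Batched input: the forward differences keep the N and C dimensions,
# so the buffers must be resized to (N, C, H - 1, W - 1).
batched = np.zeros((N, C, H, W))
x_diff = batched[:, :, :-1, :-1] - batched[:, :, :-1, 1:]
assert x_diff.shape == (N, C, H - 1, W - 1)

# A single 3-D image only needs (C, H - 1, W - 1), which is why the
# original hard-coded (3, H - 1, W - 1) happened to work for RGB input.
single = np.zeros((C, H, W))
x_diff_single = single[:, :-1, :-1] - single[:, :-1, 1:]
assert x_diff_single.shape == (C, H - 1, W - 1)
```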
Wow, you are right. Your code works as expected. I wonder whether it would be nicer to change the original code to
self.x_diff:resize(C, H - 1, W - 1)
self.y_diff:resize(C, H - 1, W - 1)
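Putting it together, the batched update that the thread converges on can be sketched in NumPy as follows. This is only an illustration of what the Lua code computes, not part of the repository; the function name `tv_grad` and the explicit `gradOutput` argument are mine. In Lua, the index range `{1, -2}` corresponds to `[:-1]` and `{2, -1}` to `[1:]`.

```python
import numpy as np

def tv_grad(input, gradOutput, strength):
    # Total-variation gradient for a batched tensor of shape (N, C, H, W),
    # mirroring TVLoss:updateGradInput above.
    grad = np.zeros_like(input)
    # Forward differences over the interior (H - 1, W - 1) window.
    x_diff = input[:, :, :-1, :-1] - input[:, :, :-1, 1:]
    y_diff = input[:, :, :-1, :-1] - input[:, :, 1:, :-1]
    grad[:, :, :-1, :-1] += x_diff + y_diff
    grad[:, :, :-1, 1:] -= x_diff
    grad[:, :, 1:, :-1] -= y_diff
    # Scale by the TV strength and add the incoming gradient.
    return grad * strength + gradOutput
```

Because every operation broadcasts over the leading N and C dimensions, stacking the same image twice in a batch produces the same per-image gradient twice, which is the behaviour the fixed resize makes possible.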