imagenet-multiGPU.torch

A question about predicting individual images

jsongcse opened this issue · 3 comments

Good morning!

Thank you for sharing your work. It is really helpful.

May I ask a question about predicting individual images? I tried to do it by following the code in the README, but I ran into some errors. My full code is:

require 'torch'
require 'cutorch'
require 'paths'
require 'xlua'
require 'optim'
require 'nn'
require 'cudnn'
require 'cunn'

torch.setdefaulttensortype('torch.FloatTensor')
local opts = paths.dofile('opts.lua')
opt = opts.parse(arg)

paths.dofile('donkey.lua')
img = testHook({loadSize}, './cr-test.jpg')
model = torch.load('./model_1.t7')
model:evaluate()
if img:dim() == 3 then
    img = img:view(1, img:size(1), img:size(2), img:size(3))
end

-- the next line causes error
predictions = model:forward(img:cuda())

and, the error message is:

-- ignore option data   
-- ignore option optimState 
-- ignore option cache  
-- ignore option netType    
-- ignore option retrain    
Loading train metadata from cache   
Loading test metadata from cache    
Loaded mean and std from cache. 
/home/jaewoo/programs/torch/install/bin/luajit: ...oo/programs/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 2 module of nn.Sequential:
In 4 module of nn.Sequential:
...torch/install/share/lua/5.1/cudnn/BatchNormalization.lua:44: assertion failed!
stack traceback:
    [C]: in function 'assert'
    ...torch/install/share/lua/5.1/cudnn/BatchNormalization.lua:44: in function 'createIODescriptors'
    ...torch/install/share/lua/5.1/cudnn/BatchNormalization.lua:64: in function <...torch/install/share/lua/5.1/cudnn/BatchNormalization.lua:63>
    [C]: in function 'xpcall'
    ...oo/programs/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    ...o/programs/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <...o/programs/torch/install/share/lua/5.1/nn/Sequential.lua:41>
    [C]: in function 'xpcall'
    ...oo/programs/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    ...o/programs/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    main-temp.lua:32: in main chunk
    [C]: in function 'dofile'
    ...rams/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

WARNING: If you see a stack trace below, it doesn't point to the place where this error occured. Please use only the one above.
stack traceback:
    [C]: in function 'error'
    ...oo/programs/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
    ...o/programs/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    main-temp.lua:32: in main chunk
    [C]: in function 'dofile'
    ...rams/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

Would you help me? Thank you for reading!

jsongcse avatar Jul 07 '16 07:07 jsongcse

For anyone else hitting this: the culprit is the nn.View module in the model definition you're loading (alexnetowtbn.lua, alexnet.lua, ...). Its input has dimensions batch_size x depth x width x height.

When you classify a single image, batch_size = 1, so instead of producing the batch_size x (depth * width * height) minibatch that nn.BatchNormalization expects, nn.View outputs a 1-D tensor of size depth * width * height, and the BatchNormalization layer fails its assertion.
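The behavior described above can be seen in isolation with nn.View alone (the concrete sizes here are illustrative, not from the repo):

```lua
-- Minimal sketch of why nn.View drops the batch dimension when
-- batch_size == 1 and no numInputDims is set.
require 'nn'

local v = nn.View(2 * 3 * 3)
-- With a batch of 4, the batch dimension is preserved: output is 4x18.
print(v:forward(torch.randn(4, 2, 3, 3)):size())
-- With a batch of 1, the 18 elements collapse to a 1-D tensor of size 18,
-- which is what trips the BatchNormalization assertion.
print(v:forward(torch.randn(1, 2, 3, 3)):size())
-- After setNumInputDims(3), a batch of 1 correctly yields 1x18.
v:setNumInputDims(3)
print(v:forward(torch.randn(1, 2, 3, 3)):size())
```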

To fix this, tell nn.View that the number of non-batch dimensions is one fewer than the total number of input dimensions. So change the classifier to classifier:add(nn.View(depth * width * height):setNumInputDims(3)).

As for the model you've already trained, you can load it into the Torch interpreter, change that classifier layer manually and then save it back to disk.
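Patching an already-saved model might look roughly like this (a sketch, assuming the checkpoint filename from the original post; findModules is part of Torch7's nn.Container API):

```lua
-- Load the trained model, fix every nn.View in place, and save it back.
require 'nn'
require 'cudnn'
require 'cunn'

local model = torch.load('model_1.t7')

-- findModules returns all modules of the given type anywhere in the net.
for _, m in ipairs(model:findModules('nn.View')) do
    -- Declare the last 3 dimensions (depth x width x height) as non-batch,
    -- so a leading batch dimension of size 1 is preserved.
    m:setNumInputDims(3)
end

torch.save('model_1_fixed.t7', model)
```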

hsoule avatar Aug 09 '16 15:08 hsoule

@hsoule Thank you for your kind comment!

Before I read your comment, I understood the problem was related to the batch size, but I did not know how to solve it. So I bypassed it by duplicating the image to form a batch of size 2.
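For reference, that workaround can be sketched as follows (variable names follow the snippet earlier in the thread; torch.cat stacks the two copies along the batch dimension):

```lua
-- Workaround: duplicate the single image into a batch of 2 so nn.View
-- keeps the batch dimension, then read the first row of predictions.
if img:dim() == 3 then
    img = img:view(1, img:size(1), img:size(2), img:size(3))
end
local batch = torch.cat(img, img, 1)          -- 2 x depth x width x height
local predictions = model:forward(batch:cuda())
local scores = predictions[1]                 -- both rows are identical
```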

Thank you for telling me how to fix the problem!

jsongcse avatar Aug 16 '16 06:08 jsongcse

Is it fair to expect nn/cudnn.BatchNormalization to support non-batched input, especially in evaluate mode? Perhaps we can add this feature?

ajdroid avatar Sep 27 '16 13:09 ajdroid