warning: unused function 'IndexToOffset_999_get'
I'm using clnn to train a ResNet model on an Intel GPU. When training starts I see the warnings below.
```
THClReduce.cl build log:
<program source>:48:28: warning: unused function 'IndexToOffset_999_get'
static inline unsigned int IndexToOffset_999_get(unsigned int linearId, global const TensorInfoCl *info) {
                           ^
THClReduce.cl build log:
<program source>:67:19: warning: unused function 'IndexToOffset_999_get'
static inline int IndexToOffset_999_get(int linearId, global const TensorInfoCl *info) {
                  ^
THClReduceAll.cl build log:
<program source>:51:28: warning: unused function 'IndexToOffset_999_get'
static inline unsigned int IndexToOffset_999_get(unsigned int linearId, global const TensorInfoCl *info) {
                           ^
<program source>:66:28: warning: unused function 'getLinearBlockId'
static inline unsigned int getLinearBlockId() {
                           ^
```
This is what I'm doing:

```lua
if opt.backend == 'cl' then
  require 'clnn'
  require 'cltorch'
  net = net:cl()
  -- cudnn.convert(net, cudnn) -- converts the net to cudnn
  -- What is the equivalent of cudnn.convert for clnn?
  criterion = criterion:cl()
end
```
Is the above code right? Is there anything else I need to do in order to use my Intel GPU?
Also, the training output shows `train Loss: nan`, which should be a number. Do I also need to convert the loss value to cl? What else needs to be converted to cl?
Best, Pramod
You can ignore the warnings, but your loss should not be nan. OpenCL doesn't use cudnn.
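Not part of the original reply, but a quick way to detect a nan or inf loss in plain Lua: in IEEE-754 arithmetic, nan is the only value not equal to itself, and division by zero overflows to `math.huge` rather than raising an error.

```lua
-- nan is the only value that is not equal to itself
local function isnan(x) return x ~= x end
-- inf compares equal to math.huge
local function isinf(x) return x == math.huge or x == -math.huge end

local inf = 1 / 0        -- float division by zero yields inf in Lua
local nan = inf - inf    -- inf - inf is undefined, so it yields nan

print(isinf(inf))  -- true
print(isnan(nan))  -- true
```

A check like `assert(not isnan(loss), 'loss became nan')` after each forward pass would pinpoint exactly which batch first produces the bad value.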
On 27 March 2017 14:00:04 CEST, Pramod Solanky [email protected] wrote:
https://github.com/hughperkins/clnn/issues/45
That's right! My question was about cudnn: with CUDA you convert your model like this - `cudnn.convert(net, cudnn)`. How do I convert with OpenCL? Right now I'm just doing `net = net:cl()`. Are there any additional steps, like cudnn has?
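For reference, a minimal sketch of the usual cltorch/clnn conversion (the `input`/`target` names are illustrative, not from the original post; as far as this thread establishes, there is no clnn equivalent of `cudnn.convert` - calling `:cl()` on the model, the criterion, and every tensor fed to them is the whole conversion):

```lua
-- cltorch provides the OpenCL tensor type, clnn the nn modules on top of it
require 'cltorch'
require 'clnn'

net = net:cl()              -- move the model's parameters to the OpenCL device
criterion = criterion:cl()  -- the criterion must match the tensor type

-- every tensor fed to the model must be converted too, otherwise
-- forward() fails with a tensor-type mismatch
local input = input:cl()
local target = target:cl()

local output = net:forward(input)
local loss = criterion:forward(output, target)
```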
I guess I got it. Looks like what I'm doing is fine; I saw an example here: https://github.com/Element-Research/rnn/issues/41. What I can't understand is that the loss is nan and my test accuracy is way below expectation even after 60 epochs: it's 2.8% :( I'm using OpenCL because the model runs out of memory soon after I start training on CPU, and on GPU I get these problems of nan loss and very low test accuracy. Input image sizes are 224x224. Any suggestions?

PS: I'm trying out different networks: VGG, ResNet, and AlexNet. I could achieve an accuracy of 78% with VGG on CPU when the image sizes are 48x48.
Looks like this line is causing the issue: `local loss = self.criterion:forward(output, target)`. When I print the loss it shows `inf` (infinity), which is why the average comes out as nan (there's a division by the number of epochs). Any ideas on this?
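A plain-Lua sketch (not from the repo; names and values are illustrative) of how a single inf loss can turn an averaged loss into nan: any subsequent `inf - inf` or `0 * inf` step yields nan, and nan then propagates through every later operation.

```lua
-- illustrative: a running mean of the loss, updated incrementally
local mean, n = 0, 0
local losses = {2.3, 1.9, 1 / 0, 1.7}  -- third loss overflowed to inf

for _, loss in ipairs(losses) do
  n = n + 1
  -- incremental mean update: mean += (loss - mean) / n
  mean = mean + (loss - mean) / n
end

-- once the inf loss arrives, mean becomes inf; on the next update
-- (loss - mean) is finite - inf = -inf, and inf + (-inf) is nan
print(mean ~= mean)  -- true, i.e. mean is nan
```

So the nan in the printed average is just the symptom; the thing to chase is why `criterion:forward` returns inf in the first place (e.g. a log of zero inside the criterion, or exploding activations).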
The interesting thing is that when I run the same code on CPU, I get the loss just fine (it just goes out of memory after a few minutes). Not sure what I'm missing here. This is the code I'm using: https://github.com/chsasank/plantvillage-challenge