argmax calculation on the GPU?

Open TimoSaemann opened this issue 7 years ago • 5 comments

Hi,

I have found that the argmax calculation is a bottleneck: the forward pass of my net takes just 18 ms, while the argmax calculation takes about 40 ms (1024x544 px). I'm sure it could be sped up by computing it on the GPU. Would it be possible for you to implement an argmax.cu? That would be really helpful! Thanks!

Best, Timo

TimoSaemann avatar Jul 12 '17 13:07 TimoSaemann
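
For readers following the thread, the computation being discussed can be sketched as below. This is a minimal CPU reference in C++, not code from Caffe or from this issue: the function name `channel_argmax` is hypothetical, and it assumes one image of scores stored as C x H x W in row-major order, with at least one channel. It is meant only to show why the operation looks like a good fit for the GPU: every pixel is processed independently of the others.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Caffe): per-pixel argmax over channels.
// Assumes scores.size() == channels * height * width, channels >= 1,
// and a C x H x W row-major layout.
std::vector<int> channel_argmax(const std::vector<float>& scores,
                                std::size_t channels,
                                std::size_t height,
                                std::size_t width) {
  const std::size_t pixels = height * width;
  std::vector<int> labels(pixels, 0);
  // Each iteration touches only its own pixel, so on a GPU this outer
  // loop could become one thread per pixel.
  for (std::size_t p = 0; p < pixels; ++p) {
    float best = scores[p];  // score of channel 0 at pixel p
    int best_c = 0;
    for (std::size_t c = 1; c < channels; ++c) {
      const float v = scores[c * pixels + p];
      if (v > best) {  // strict '>' keeps the lowest index on ties
        best = v;
        best_c = static_cast<int>(c);
      }
    }
    labels[p] = best_c;
  }
  return labels;
}
```

In a GPU version under the same assumptions, the inner loop over channels would stay as it is and the index `p` would be derived from the thread and block indices instead of a loop counter.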

Hi Timo, good idea, thank you!

drnikolaev avatar Jul 30 '17 07:07 drnikolaev

Hi Nikolaev, are there any new developments? Can you estimate when the argmax calculation will be implemented on the GPU? Thanks a lot, Timo

TimoSaemann avatar Aug 21 '17 13:08 TimoSaemann

Hi @TimoSaemann, I looked into it and I can't estimate the time, sorry. I'll update this post when I have some news.

drnikolaev avatar Aug 29 '17 02:08 drnikolaev

Hi @drnikolaev, I just want to mention that I am still interested in an implementation of the argmax.cu layer. I would be really happy to hear about the current status. Thanks, Timo

TimoSaemann avatar Nov 30 '17 11:11 TimoSaemann

Hi @TimoSaemann, it's on my plate, but there are a few urgent bugs to fix before the next release. Sorry about this.

drnikolaev avatar Dec 06 '17 23:12 drnikolaev