caffe
argmax calculation on the GPU?
Hi,
I have found that the argmax calculation is a bottleneck. While the forward pass of my net takes just 18 ms, the argmax calculation takes about 40 ms (1024x544 px). I'm sure this calculation could be sped up by running it on the GPU. Would it be possible for you to implement an argmax.cu? That would be really helpful! Thanks!
Best, Timo
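For readers following the thread, here is a minimal sketch of the computation being discussed, assuming the argmax is taken over the class-score channels for each pixel of a single image stored in channel-major (CHW) order. The function name `channel_argmax` and its parameters are hypothetical and are not Caffe's ArgMax layer code. The point it illustrates is that every output pixel depends only on its own scores, which is why the work could plausibly be mapped to one GPU thread per pixel.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Caffe): per-pixel argmax over the channel
// axis for one image whose scores are laid out as [channels][height][width].
// Returns one channel index per pixel; on ties the lowest index wins.
std::vector<int> channel_argmax(const std::vector<float>& scores,
                                std::size_t channels,
                                std::size_t height,
                                std::size_t width) {
  const std::size_t num_pixels = height * width;
  std::vector<int> labels(num_pixels, 0);
  // Guard against empty or undersized input instead of reading out of bounds.
  if (channels == 0 || scores.size() < channels * num_pixels) {
    return labels;
  }
  // Each iteration touches only the scores of pixel p, so the loop body is
  // independent across pixels. In a GPU version this body would be the work
  // of a single thread.
  for (std::size_t p = 0; p < num_pixels; ++p) {
    float best = scores[p];  // channel 0
    int best_c = 0;
    for (std::size_t c = 1; c < channels; ++c) {
      const float v = scores[c * num_pixels + p];
      if (v > best) {
        best = v;
        best_c = static_cast<int>(c);
      }
    }
    labels[p] = best_c;
  }
  return labels;
}
```

With this layout, neighbouring pixels read neighbouring memory addresses for a given channel, which is the access pattern a GPU port would generally want.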
Hi Timo, good idea, thank you!
Hi Nikolaev, are there any new developments? Can you estimate when the argmax calculation will be implemented on the GPU? Thanks a lot, Timo
Hi @TimoSaemann I looked through it and I can't estimate the time, sorry. I'll update this post when I get some news.
Hi @drnikolaev I just want to mention that I am still interested in an implementation of the argmax.cu layer. I would be really happy to hear about the current status. Thanks, Timo
Hi @TimoSaemann it's on my plate, but there are a few urgent bugs to fix before the next release, sorry about this.