
Passing input from GPU for inference

Open solarflarefx opened this issue 5 years ago • 3 comments

I am working in a pipeline architecture where all intermediary data resides on the GPU. I am interested in passing data already on the GPU into a DL network, with the output also residing on the GPU so it can be passed to the next element in the pipeline. Essentially, I would like to pass a CUDA array into the network, get an output, and convert it back into a CUDA array, all without performing CPU/GPU copies. Is this possible with CNTK?

solarflarefx avatar Dec 30 '19 02:12 solarflarefx

I believe yes, using the C++ API, where you have direct memory access. Unfortunately, the C++ API is not documented as well as the Python API.

haixpham avatar Dec 30 '19 10:12 haixpham

Would this also be possible from the C# API? I am currently running the pipeline using managedCuda: https://kunzmi.github.io/managedCuda/

solarflarefx avatar Jan 02 '20 20:01 solarflarefx

I don't have experience with CUDA in C#, unfortunately. But you may find a way in this regression example. Look at GenerateValueData().

Basically, you pass your memory to Value objects, which are bound to InputVariables as inputs to the network.
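To illustrate, here is a minimal C++ sketch of that pattern, assuming CNTK's C++ API (CNTKLibrary.h). The names d_input, inputDim, and modelFunc are hypothetical placeholders for your own pipeline state, and the exact shape/dynamic-axis handling depends on your model:

```cpp
// Sketch only: wrap an existing GPU buffer in a CNTK Value without a host copy.
#include "CNTKLibrary.h"
#include <unordered_map>

using namespace CNTK;

ValuePtr EvaluateOnGpu(FunctionPtr modelFunc,
                       float* d_input,   // device pointer, already on the GPU
                       size_t inputDim)
{
    auto device = DeviceDescriptor::GPUDevice(0);

    // Construct an NDArrayView directly over the device buffer. Because the
    // buffer already lives on 'device', no CPU/GPU transfer should occur.
    NDShape shape({ inputDim });
    auto view = MakeSharedObject<NDArrayView>(DataType::Float, shape,
                                              d_input,
                                              inputDim * sizeof(float),
                                              device);
    auto inputValue = MakeSharedObject<Value>(view);

    Variable inputVar  = modelFunc->Arguments()[0];
    Variable outputVar = modelFunc->Output();

    std::unordered_map<Variable, ValuePtr> inputs  = { { inputVar, inputValue } };
    std::unordered_map<Variable, ValuePtr> outputs = { { outputVar, nullptr } };

    // Run inference on the GPU; CNTK allocates the output Value on 'device'.
    modelFunc->Evaluate(inputs, outputs, device);

    // The output buffer also resides in GPU memory; its raw device pointer
    // can be handed to the next pipeline stage, e.g.:
    //   const float* d_out = outputs[outputVar]->Data()->DataBuffer<float>();
    return outputs[outputVar];
}
```

Note that this aliases your buffer rather than copying it (unlike the Value::Create helpers, which copy from host data), so the buffer must stay alive and unmodified for as long as CNTK may read from it.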

haixpham avatar Jan 02 '20 21:01 haixpham