CNTK
Passing input from GPU for inference
I am working in a pipeline architecture where all intermediate data resides on the GPU. I would like to pass data already on the GPU into a DL network, with the output also residing on the GPU so it can be passed to the next element in the pipeline. Essentially, I would like to pass a CUDA array into the network, get an output, and convert it back into a CUDA array, all without performing CPU/GPU copies. Is this possible with CNTK?
I believe yes, using the C++ API, where you have direct memory access. Unfortunately the C++ API is not documented as well as the Python API.
Would this also be possible from the C# API? I am currently running the pipeline using managedCuda: https://kunzmi.github.io/managedCuda/
I don't have experience with CUDA in C#, unfortunately. But you may find a way from this regression example; look at GenerateValueData().
Basically, you pass memory to Value objects, which are bound to an InputVariable as inputs to the network.
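As an illustration of that idea, here is a hedged C++ sketch of how an existing GPU-resident buffer could be wrapped in a CNTK Value without a device-to-host copy, using the NDArrayView constructor that aliases a caller-provided buffer on a given device. The function name WrapGpuBuffer and the variables gpuInputPtr, inputVar, and modelFunc are illustrative assumptions, not part of the thread; this is untested and only shows the call pattern.

```cpp
// Sketch (assumption-laden, untested): wrap an existing GPU float buffer
// in a CNTK Value so the network reads it in place, then run inference.
#include "CNTKLibrary.h"
#include <unordered_map>

using namespace CNTK;

// `gpuInputPtr` is assumed to point to `numElements` floats already
// resident on GPU 0 (e.g. allocated by an earlier pipeline stage).
ValuePtr WrapGpuBuffer(float* gpuInputPtr, size_t numElements,
                       const NDShape& sampleShape)
{
    auto device = DeviceDescriptor::GPUDevice(0);

    // NDArrayView can alias a caller-owned device buffer; no copy is made.
    // The caller must keep the buffer alive while the view is in use.
    auto view = MakeSharedObject<NDArrayView>(
        DataType::Float, sampleShape, gpuInputPtr,
        numElements * sizeof(float), device, /*readOnly=*/false);

    return MakeSharedObject<Value>(view);
}

// Hypothetical usage: bind the wrapped buffer to the network's input and
// evaluate on the same GPU, so the output Value also lives on the device.
void RunInference(const FunctionPtr& modelFunc, const Variable& inputVar,
                  float* gpuInputPtr, size_t numElements)
{
    auto device = DeviceDescriptor::GPUDevice(0);
    auto inputValue = WrapGpuBuffer(gpuInputPtr, numElements,
                                    inputVar.Shape());

    std::unordered_map<Variable, ValuePtr> inputs =
        { { inputVar, inputValue } };
    // Passing nullptr lets CNTK allocate the output on `device`.
    std::unordered_map<Variable, ValuePtr> outputs =
        { { modelFunc->Output(), nullptr } };

    modelFunc->Evaluate(inputs, outputs, device);
    // outputs[modelFunc->Output()]->Data() is an NDArrayView on the GPU,
    // whose buffer can be handed to the next pipeline stage.
}
```

The key point is that the NDArrayView is created on the GPU DeviceDescriptor rather than the CPU one, so no host round trip occurs when binding the input or reading the output.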