CNTK
Passing input from GPU for inference
I am working in a pipeline architecture where all intermediate data resides on the GPU. I would like to pass data already on the GPU into a DL network, with the output also residing on the GPU so it can be passed to the next element in the pipeline. Essentially, I would like to pass a CUDA array into the network, get an output, and convert it back into a CUDA array, all without performing CPU/GPU copies. Is this possible with CNTK?
I believe yes, using the C++ API, where you have direct memory access. Unfortunately the C++ API is not documented as well as the Python API.
Would this also be possible from the C# API? I am currently running the pipeline using managedCuda: https://kunzmi.github.io/managedCuda/
I don't have experience with CUDA in C#, unfortunately. But you may find a way from this regression example; look at GenerateValueData().
Basically, you pass memory to Value objects, which are bound to an InputVariable as inputs to the network.
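As an illustration of that idea, here is a hedged C++ sketch of how an existing GPU-resident buffer could be wrapped in a CNTK Value without a device-to-host copy, using the NDArrayView constructor that aliases a caller-provided buffer on a given device. The function name WrapGpuBuffer and the variables gpuInputPtr, inputVar, and modelFunc are illustrative assumptions, not part of the thread; this is untested and only shows the call pattern.

```cpp
// Sketch (assumption-laden, untested): wrap an existing GPU float buffer
// in a CNTK Value so the network reads it in place, then run inference.
#include "CNTKLibrary.h"
#include <unordered_map>

using namespace CNTK;

// `gpuInputPtr` is assumed to point to `numElements` floats already
// resident on GPU 0 (e.g. allocated by an earlier pipeline stage).
ValuePtr WrapGpuBuffer(float* gpuInputPtr, size_t numElements,
                       const NDShape& sampleShape)
{
    auto device = DeviceDescriptor::GPUDevice(0);

    // NDArrayView can alias a caller-owned device buffer; no copy is made.
    // The caller must keep the buffer alive while the view is in use.
    auto view = MakeSharedObject<NDArrayView>(
        DataType::Float, sampleShape, gpuInputPtr,
        numElements * sizeof(float), device, /*readOnly=*/false);

    return MakeSharedObject<Value>(view);
}

// Hypothetical usage: bind the wrapped buffer to the network's input and
// evaluate on the same GPU, so the output Value also lives on the device.
void RunInference(const FunctionPtr& modelFunc, const Variable& inputVar,
                  float* gpuInputPtr, size_t numElements)
{
    auto device = DeviceDescriptor::GPUDevice(0);
    auto inputValue = WrapGpuBuffer(gpuInputPtr, numElements,
                                    inputVar.Shape());

    std::unordered_map<Variable, ValuePtr> inputs =
        { { inputVar, inputValue } };
    // Passing nullptr lets CNTK allocate the output on `device`.
    std::unordered_map<Variable, ValuePtr> outputs =
        { { modelFunc->Output(), nullptr } };

    modelFunc->Evaluate(inputs, outputs, device);
    // outputs[modelFunc->Output()]->Data() is an NDArrayView on the GPU,
    // whose buffer can be handed to the next pipeline stage.
}
```

The key point is that the NDArrayView is created on the GPU DeviceDescriptor rather than the CPU one, so no host round trip occurs when binding the input or reading the output.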