MIVisionX OpenVX Framework - Kernel execution on CPU

Hi,

I was going through the code to understand the implementation, in particular how kernels get executed in parallel on the CPU when a graph has nodes that could run in parallel. Am I wrong, or do all nodes/kernels of a graph get executed serially on a single core? At least that is what I understand from looking at the agoExecuteGraph() function. Maybe the situation is different for OpenCL.

Sep 10 '19 13:09 bogdanul2003

@bogdanul2003 the nodes in the graph execute serially on a single core. The nodes themselves can use the available cores to execute parallel computation. OpenCL nodes occupy the required number of CUs when launched on a GPU.

Sep 10 '19 20:09 kiritigowda

@bogdanul2003 just to add to what @kiritigowda said: multiple sub_graphs can be created to run on different cores. OpenCL always uses parallel threads on the GPU.

Sep 12 '19 23:09 rrawther

One thing to mention: at the moment my workload is CPU-only. If I understand correctly, I need to compile the framework with OpenCL support to get nodes executed in parallel on different CPU cores? My question (I see I forgot to mention this) was about the case where you compile without OpenCL support. @rrawther is it possible to run sub_graphs on different cores without OpenCL? I couldn't figure out what decides which sub_graphs can be executed on different cores.

Sep 18 '19 10:09 bogdanul2003

@bogdanul2003: Currently the OpenCL implementation targets GPUs only. We don't have an OpenCL implementation that executes in parallel on different CPU cores. Are you running on Windows or Linux? We have multithreading support on Windows, assuming you have created separate graphs for the nodes that have to run in parallel. Because of data dependencies, most OpenVX graphs are executed sequentially in our current implementation.
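As a rough illustration of the separate-graphs approach (a self-contained sketch, not MIVisionX code): each independent graph is handed to its own thread, and the nodes inside one graph still run serially. `Graph`, `process_graph` and `process_graphs_in_parallel` are hypothetical stand-ins. In a real OpenVX program each thread would call `vxProcessGraph` on its own `vx_graph`, and whether that is safe with a shared context depends on the implementation.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a graph. In a real OpenVX program this would be a
// vx_graph, and running it would be a call to vxProcessGraph(graph).
struct Graph {
    std::vector<std::function<void()>> nodes;  // nodes run serially, in order
};

// Runs the nodes of one graph one after another on the calling thread.
void process_graph(const Graph& g) {
    for (const auto& node : g.nodes) {
        assert(node && "graph contains an empty node");
        node();
    }
}

// Runs each graph on its own thread and waits for all of them to finish.
// The graphs must not share data, since nothing here synchronizes them.
void process_graphs_in_parallel(const std::vector<Graph>& graphs) {
    std::vector<std::thread> workers;
    workers.reserve(graphs.size());
    for (const auto& g : graphs)
        workers.emplace_back(process_graph, std::cref(g));
    for (auto& w : workers) w.join();
}
```

The application decides how to split the work: nodes that depend on each other go into the same graph, and only graphs with no shared data are run side by side.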

Sep 19 '19 18:09 rrawther

Thanks @rrawther. I thought the framework could figure out which nodes can be executed in parallel from how the graph is built. Do you plan to add this feature for CPU-only workloads as well? Do you know if other OpenVX implementations (Nvidia or Intel) offer this kind of optimization?

Sep 20 '19 09:09 bogdanul2003

@bogdanul2003 Once the nodes are submitted to the GPU, they can run in parallel provided there is no data dependency. The OpenVX framework checks that the input data is ready before a node is executed. We don't have much insight into the Intel or NVIDIA implementations, but we will be adding enhancements to our implementation in the future.
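To make the "input data is ready" idea concrete, here is a minimal sketch of dependency-based grouping. It is an illustration only, not how the MIVisionX scheduler is written; `Node` and `schedule_waves` are hypothetical names. A node becomes eligible once every node it reads from has finished, so the nodes in one wave have no dependencies on each other and could in principle run in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node description: deps lists the indices of the nodes whose
// outputs this node reads. This is not MIVisionX's internal representation.
struct Node {
    std::vector<std::size_t> deps;
};

// Groups nodes into waves. A node joins a wave once every node it depends on
// has finished in an earlier wave, so nodes inside one wave are independent.
// If the graph contains a cycle, the nodes on the cycle are never scheduled.
std::vector<std::vector<std::size_t>> schedule_waves(const std::vector<Node>& nodes) {
    std::vector<bool> done(nodes.size(), false);
    std::vector<std::vector<std::size_t>> waves;
    std::size_t remaining = nodes.size();
    while (remaining > 0) {
        std::vector<std::size_t> wave;
        for (std::size_t i = 0; i < nodes.size(); ++i) {
            if (done[i]) continue;
            bool ready = true;
            for (std::size_t d : nodes[i].deps) {
                assert(d < nodes.size());
                if (!done[d]) { ready = false; break; }
            }
            if (ready) wave.push_back(i);
        }
        if (wave.empty()) break;  // cycle: no remaining node can become ready
        for (std::size_t i : wave) done[i] = true;
        remaining -= wave.size();
        waves.push_back(wave);
    }
    return waves;
}
```

For a diamond-shaped graph (node 0 feeds nodes 1 and 2, which both feed node 3), this yields three waves, with nodes 1 and 2 sharing the middle one.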

Sep 27 '19 00:09 rrawther

@rrawther thanks for the clarification. Can we keep this ticket open until this feature is added?

Sep 27 '19 07:09 bogdanul2003
