
Timing of matrix vector multiplication in OpenCL mode

Open GoogleCodeExporter opened this issue 10 years ago • 5 comments

Precise timing of the OpenCL matvec was removed in r1334, which makes it 
hard to track issues with the OpenCL kernels on different devices. A desirable 
goal would be an option that switches the command queue into profiling mode 
and reads the precise timings from the events returned by 
clEnqueueNDRangeKernel. This should be possible at run time, so it 
can be implemented directly as an adda option rather than as a compile-time 
option using ifdefs. 
This would help to identify performance issues of the kernels on different 
devices.

r1334 - febb9ca148e8b25dfb05870743b38c780efc9fee

Original issue reported on code.google.com by [email protected] on 21 May 2014 at 7:22

GoogleCodeExporter, Aug 12 '15 07:08