PyOpenCL-Tutorial
Unfair comparison of timing in one of the examples
Dear Ben,
Nice examples, thanks for writing this!
The 030_timing.py code makes an unfair comparison: it times a slow Python loop against compiled OpenCL C code. In this case, the C kernel will always be faster. A fairer comparison would be between C code that performs the sum on the CPU and the OpenCL version executed on the GPU. Does this make sense?
You could use numba.jit() to optimise the CPU loop function and compare results.
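A minimal sketch of that idea, assuming the CPU function in 030_timing.py is an element-wise sum of two float32 arrays (the names cpu_sum_loop, cpu_sum_jit and time_call are hypothetical, not taken from the tutorial). If numba is not installed, the fallback leaves the loop uncompiled so the script still runs:

```python
import time

import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback (assumption): without numba, return the function unchanged.
        return func


def cpu_sum_loop(a, b, out):
    # Hypothetical stand-in for the tutorial's CPU loop: element-wise sum.
    for i in range(a.shape[0]):
        out[i] = a[i] + b[i]
    return out


# Compiled variant of the same loop; cpu_sum_loop itself stays pure Python.
cpu_sum_jit = njit(cpu_sum_loop)


def time_call(func, *args, repeats=3):
    # Return the best wall-clock time over a few repeats.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


n = 100_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out_loop = np.empty_like(a)
out_jit = np.empty_like(a)

# Warm-up call so that JIT compilation time is not included in the timing.
cpu_sum_jit(a, b, out_jit)

t_loop = time_call(cpu_sum_loop, a, b, out_loop)
t_jit = time_call(cpu_sum_jit, a, b, out_jit)

print(f"Python loop: {t_loop:.6f} s")
print(f"numba loop:  {t_jit:.6f} s")
```

The numba timing can then be set against the OpenCL kernel timing, which should be a closer match than the plain Python loop.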
Good point