
Some functions in this library are slower than OpenCV 4.5.5

Open SheepKeeper1990 opened this issue 2 years ago • 6 comments

Hi. First of all, thank you for your contribution. When I used this library, I tested Simd::Resize(), Simd::BgrToGray and so on. Some functions are 4-6 times slower than OpenCV 4.5.5. (OS: Ubuntu 18.04, CPU: i7-10750H, 12 cores)

Did I miss anything when using it? Could you give an image-processing use case where it is faster than the corresponding OpenCV function? If most of the functions are slower than OpenCV, what are the advantages of this library? I sincerely look forward to your answer.

SheepKeeper1990 avatar May 23 '22 07:05 SheepKeeper1990

Hi! Thank you for your response. OpenCV uses all CPU cores by default. Simd specializes in single-thread performance (for example, when resize is called from many threads).
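
A minimal sketch of that usage pattern, assuming a batch of independent images: each worker thread runs a single-threaded kernel on its own images, so all cores stay busy without any threading inside the kernel. `Image`, `BgrToGraySingleThread` and `ConvertBatch` are hypothetical names used for illustration, and the scalar loop is only a stand-in for a real library call, not Simd's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical image type: tightly packed pixels, no row padding.
struct Image {
    int width = 0, height = 0;
    std::vector<uint8_t> data;
};

// Hypothetical stand-in for a single-threaded kernel (e.g. a library call).
// Converts interleaved BGR to 8-bit gray using BT.601 weights scaled by 2^14.
void BgrToGraySingleThread(const Image& bgr, Image& gray) {
    assert(bgr.data.size() == size_t(bgr.width) * bgr.height * 3);
    gray.width = bgr.width;
    gray.height = bgr.height;
    gray.data.resize(size_t(bgr.width) * bgr.height);
    for (size_t i = 0; i < gray.data.size(); ++i) {
        const uint8_t* p = &bgr.data[i * 3];
        gray.data[i] = uint8_t((1868 * p[0] + 9617 * p[1] + 4899 * p[2] + 8192) >> 14);
    }
}

// Each worker converts a disjoint subset of the images (i = t, t + threads, ...).
void ConvertBatch(const std::vector<Image>& src, std::vector<Image>& dst, unsigned threads) {
    dst.resize(src.size());
    threads = std::max(1u, threads);
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t)
        pool.emplace_back([&, t] {
            for (size_t i = t; i < src.size(); i += threads)
                BgrToGraySingleThread(src[i], dst[i]);
        });
    for (auto& th : pool) th.join();
}
```

Calling `ConvertBatch(images, results, std::thread::hardware_concurrency())` would spread the batch over the available hardware threads. This only helps when there are several images to process at once.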

ermig1979 avatar May 23 '22 10:05 ermig1979

Hi! Thank you for your response. Does the Simd library support multithreaded computing? Could you show me a simple example or a pseudocode structure? Thanks again.

SheepKeeper1990 avatar May 24 '22 00:05 SheepKeeper1990

I tried to combine Simd functions such as Simd::ResizeBilinear() with OpenMP to achieve multi-core acceleration, but it didn't work. It takes 7.79 ms to execute 10 times, while OpenCV 4.5.5 takes 0.73 ms. How can I improve this?

    omp_set_num_threads(12);
    #pragma omp parallel
    {
        Simd::ResizeBilinear(viewSrc, viewDst);
    }

SheepKeeper1990 avatar May 24 '22 01:05 SheepKeeper1990

Hi! Unfortunately, it does not work that way. I would have to rewrite the implementation of `ResizeBilinear`. I will add this issue to my future development plans.
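
For context, a bare `#pragma omp parallel` region makes every thread execute the whole block, so the OpenMP snippet quoted earlier in this thread repeats the full resize on each thread instead of dividing it. For a purely per-pixel operation, where each output row depends only on the matching input row, dividing the work by row bands can be sketched as below. `GrayRows` and `GrayParallel` are hypothetical names and the scalar kernel is a stand-in, not Simd's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical single-threaded kernel: converts `rows` rows of interleaved BGR
// to 8-bit gray (BT.601 weights scaled by 2^14). Strides are in bytes.
void GrayRows(const uint8_t* bgr, size_t bgrStride, uint8_t* gray, size_t grayStride,
              size_t width, size_t rows) {
    for (size_t y = 0; y < rows; ++y) {
        const uint8_t* s = bgr + y * bgrStride;
        uint8_t* d = gray + y * grayStride;
        for (size_t x = 0; x < width; ++x, s += 3)
            d[x] = uint8_t((1868 * s[0] + 9617 * s[1] + 4899 * s[2] + 8192) >> 14);
    }
}

// Splits the image into contiguous row bands and runs the kernel on each band
// in its own thread. Bands do not overlap, so no synchronization is needed.
void GrayParallel(const uint8_t* bgr, size_t bgrStride, uint8_t* gray, size_t grayStride,
                  size_t width, size_t height, size_t threads) {
    assert(threads > 0);
    threads = std::min(threads, std::max<size_t>(height, 1));
    const size_t band = (height + threads - 1) / threads; // rows per band, rounded up
    std::vector<std::thread> pool;
    for (size_t y0 = 0; y0 < height; y0 += band) {
        const size_t rows = std::min(band, height - y0);
        pool.emplace_back(GrayRows, bgr + y0 * bgrStride, bgrStride,
                          gray + y0 * grayStride, grayStride, width, rows);
    }
    for (auto& th : pool) th.join();
}
```

This sketch covers only per-pixel conversions. It says nothing about how a resize would have to be split, which is the part the reply above describes as needing changes inside the implementation.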

ermig1979 avatar May 26 '22 06:05 ermig1979

Thank you again for your contribution.

SheepKeeper1990 avatar May 30 '22 08:05 SheepKeeper1990

Hi @ermig1979, thank you for your contribution.

I have a similar issue with debayering: it is around 5 times slower than OpenCV's debayering. I'm new to this, but from what I understand, OpenCV uses all my CPU cores for debayering, whereas the Simd library uses a single thread for the operation. If that is true, is there any way to increase performance, or does the library implementation have to change?